BU Research Data

Recent Submissions

Now showing 1 - 20 of 67
  • Item
    Dataset: Modeling the concentration enhancement and selectivity of plastic particle transport in sea spray aerosols
    (2024) Dubitsky, Lena; Deane, Grant; Stokes, M. Dale; Bird, James C.
    This dataset was created from a combination of theoretical models investigating the concentration enhancement and selectivity of plastic particle transport in sea spray aerosols. Following the approach and assumptions outlined in a paper with the same title, the data reports the theoretical enrichment expected in film drops and jet drops on a per bubble size per particle size basis. The bounds of the enrichment depend on the attachment efficiency, which can vary from zero to one, and representative values for the model are calculated for both extremes. Furthermore, this enrichment combines with the bubble-size distribution modeled from a breaking wave along with the plastic particle size distribution modeled from observations to produce an estimate for the number flux of ejected particles per wave on a per bubble size per particle size basis. Calculation results are also provided for the particle surface area and volume flux on a per bubble size per particle size basis from bursting bubbles with and without scavenging (attachment efficiency one or zero). Finally, the ratios of the enrichment, number, area, and volume are computed when integrated across particle size and bubble size.
  • Item
    Raw data for Chapter 22 "Phytolith results from Tomb 16/H/50"
    (In: Megiddo VII: The Shmunis Excavations of Tomb 50 and Burial 45. Editors: Melissa S. Cradic, Matthew J. Adams, and Israel Finkelstein. Tel Aviv University Press, 2024-01-01) Wade, Kali R.; Marston, John M.
  • Item
    Mechanical MNIST – Unsupervised Learning Dataset
    (2023) Nguyen, Quan; Lejeune, Emma
    The Mechanical MNIST dataset collection contains Finite Element simulations of heterogeneous materials undergoing applied displacement. Here, we introduce a new benchmark dataset designed specifically for assessing unsupervised learning methods where the goal is to discover patterns from unlabeled data. To obtain this dataset, we generate displacement fields from Finite Element simulations and uniformly sample approximately 1500 displacement markers on each domain of interest. Since unsupervised learning aims to identify patterns in unlabeled data, we provide a dataset where the primary objective is to explore unlabeled data, while simultaneously providing “ground truth” information to ultimately evaluate the efficacy of different unsupervised learning approaches. It is important to note, however, that in the intended applications of these methods, ground truth information will likely be absent, particularly in experimental studies of intricate heterogeneous soft tissue. Broadly speaking, this computationally generated dataset mimics the behavior of soft materials, while simultaneously providing ground truth information for method evaluation. In total, the dataset contains the following combinations of conditions: 6 different heterogeneous material patterns, 2 constitutive models, 4 controlled boundary conditions, and 1 random boundary condition. We include a tutorial document for the dataset named “dataset_tutorials.pdf”. This document describes the contents of the dataset and gives instructions on how to use the data. The many options in the dataset should enable researchers to explore unsupervised learning methods on soft materials.
  • Item
    Dataset for: New Insights into the Role of Atmospheric Transport and Mixing on Column and Surface Concentrations of NO2 at a Coastal Urban Site
    (2023) Geddes, Jeffrey; Adams, Taylor
    This entry contains the raw nitrogen dioxide column abundance collected from Pandora direct sun spectrometer measurements at Boston University described in the publication: "New Insights into the Role of Atmospheric Transport and Mixing on Column and Surface Concentrations of NO2 at a Coastal Urban Site", published in the Journal of Geophysical Research: Atmospheres (https://doi.org/10.1029/2022JD038237).
  • Item
    Integration of point-of-care screening for type 2 diabetes mellitus and hypertension with COVID-19 rapid antigen screening in Johannesburg, South Africa
    (PLoS ONE, 2023-04-20) Brennan, Alana T.; Meyer-Rath, Gesine; Vetter, Beatrice; Majam, Mohammed; Msolomba, Vanessa; Venter, Francois; Carmona, Sergio; Gordon, Adena; Kao, Kekeletso
    Aims: We sought to evaluate the yield and linkage-to-care for diabetes and hypertension screening alongside a study assessing the use of rapid antigen tests for COVID-19 in taxi ranks in Johannesburg, South Africa. Methods: Participants were recruited from Germiston taxi rank. We recorded results of blood glucose (BG), blood pressure (BP), waist circumference, smoking status, height, and weight. Participants who had elevated BG (fasting>7.0; random>11.1mmol/L) and/or BP (diastolic>90 and systolic>140mmHg) were referred to their clinic and phoned to confirm linkage. Results: 1169 participants were enrolled and screened for diabetes and hypertension. Combining participants with a previous diagnosis of diabetes (n=23, 2%; 95% CI:1.3-2.9%) and those that had an elevated BG measurement (n=60, 5.2%; 95%CI:4.1-6.6%) at study enrollment, we estimated an overall indicative prevalence of diabetes of 7.1% (95% CI:5.7-8.7%). When combining those with known hypertension at study enrollment (n=124, 10.6%; 95%CI:8.9-12.5%) and those with elevated BP (n=202; 17.3%; 95%CI:15.2-19.5%), we get an overall prevalence of hypertension of 27.9% (95% CI:25.4-30.1%). Only 31.7% of those with elevated BG and 16.0% of those with elevated BP linked-to-care. Conclusion: By opportunistically leveraging existing COVID-19 screening in South Africa to screen for diabetes and hypertension, 24% of participants received a potential new diagnosis. We had poor linkage-to-care following screening. Future research should evaluate options for improving linkage-to-care, and evaluate the large-scale feasibility of this simple screening tool.
  • Item
    Dataset for: Implications of sea breezes on air quality monitoring in a coastal urban environment, evidence from high resolution modeling of NO2 and O3
    Geddes, Jeffrey; Wang, Bo
    This dataset contains observations and chemical transport model output relevant to an analysis of sea breeze impacts on local air quality in Boston. Model output is from a summer 2019 WRF-Chem simulation over the Northeast US, including the greater Boston region. Observational assets were compiled in order to evaluate the model performance.
  • Item
    Dataset: effects of salinity beyond coalescence on submicron aerosol distributions
    (2022-11-18) Dubitsky, Lena; Bird, James; Deane, Grant; Stokes, M. Dale
    This dataset was created from laboratory experiments investigating the effect of salinity on submicron aerosol production. Bubbles generated in solutions of sodium acetate and artificial seawater were tested with corresponding measurements of the submicron aerosol size distribution. The bubbling below and above the liquid surface was also imaged.
  • Item
    Dataset for: Mate guarding by male orangutans in Gunung Palung National Park, Knott Lab
    Scott, Amy; Banes, Graham; Setiadi, Wuryantari; Saragih, Jessica; Susanto, Wahyu; Mitra Setia, Tatang; Knott, Cheryl
  • Item
    MAHMAZ maternity waiting home: setup cost dataset
    Juntunen, Allison; Scott, Nancy A.; Kaiser, Jeanette L.; Vian, Taryn; Ngoma, Thandiwe; Mataka, Kaluba K.; Bwalya, Misheck; Sakanga, Viviane; Kalaba, David; Biemba, Godfrey; Rockers, Peter C.; Hamer, Davidson H.; Long, Lawrence C.
    These datasets detail 1) the setup costs expended to set up 10 maternity waiting homes in rural Zambia and 2) the monthly occupancy of the maternity waiting homes. The former includes the date of purchase, cost category, and the purchase amount in Kwacha. The latter describes how many patients visited the maternity waiting home in the last year of our project. We utilized this data to create a manuscript describing the setup costs of these homes, and the cost per admission to the homes, to serve as a guide for future implementors.
  • Item
    Dataset: Observational study of the clinical performance of a Public-Private Partnership national referral hospital network in Lesotho: Do improvements last over time?
    Scott, Nancy A.; Kaiser, Jeanette L.; Jack, Brian W.; Nkabane-Nkholongo, Elizabeth L.; Juntunen, Allison; Nash, Tshema; Alade, Mayowa; Vian, Taryn
    Public-private partnerships (PPP) may increase healthcare quality but lack longitudinal evidence for success. The Queen ‘Mamohato Memorial Hospital (QMMH) in Lesotho is one of Africa's first healthcare PPPs. We compare data from 2012 and 2018 on capacity, utilization, quality, and outcomes to understand if early documented successes have been sustained using the same measures over time. In this observational study using administrative and clinical data, we assessed beds, admissions, average length of stay (ALOS), outpatient visits, and patient outcomes. We measured triage time and crash cart stock through direct observation in 2013 and 2020. Operational hospital beds increased from 390 to 410. Admissions decreased (-5.3%) while outpatient visits increased (3.8%). ALOS increased from 5.1 to 6.5 days. Occupancy increased from 82% to 99%; half of the wards had occupancy rates ≥90%, and Neonatal ward occupancy was 209%. The proportion of crash cart stock present (82.9% to 73.8%) and timely triage (84.0% to 27.6%) decreased. While overall mortality decreased (8.0% to 6.5%) and neonatal mortality overall decreased (18.0% to 16.3%), mortality among very low birth weight neonates increased (30.2% to 36.8%). Declines in overall hospital mortality are promising. Yet, continued high occupancy could compromise infection control and impede response to infections, such as COVID-19. High occupancy in the Neonatal ward suggests that the population need for neonatal care outpaces QMMH capacity; improvements should be addressed at the hospital and systemic levels. The increase in ALOS is acceptable for a hospital meant to take the most critical cases. The decline in crash cart stock completeness and timely triage may affect access to emergency treatment. 
While the partnership itself ended earlier than anticipated, our evaluation suggests that generally the hospital under the PPP was operational, providing high-level, critically needed services, and continued to improve patient outcomes. Quality at QMMH remained substantially higher than at the former Queen Elizabeth II hospital.
  • Item
    Isoprene and soil NOx impacts on nonattainment ozone from GEOS-Chem
    (2022) Geddes, Jeffrey
    Ozone (O3) is a criteria air pollutant that continues to pose a threat to more than one hundred million Americans each year, despite progress in regulating precursor emissions. In many O3-polluted areas, the role of natural emissions of isoprene in the production of ground-level O3 has been well recognized, but this chemistry depends strongly on local anthropogenic emissions which have been changing rapidly. We use an updated estimate of anthropogenic emissions to demonstrate that many areas that remain in nonattainment of the federally mandated O3 standard are now much less sensitive to natural isoprene emissions, with biogenic nitrogen oxide emissions from soils becoming more and more important. The role of these soil emissions on O3 in nonattainment areas has not been well characterized, but, as we show in our companion article in JGR-Atmospheres, this will become increasingly necessary for good O3 policy in nonattainment areas.
  • Item
    Mechanical MNIST – Distribution Shift
    (2022) Yuan, Lingxiao; Park, Harold S.; Lejeune, Emma
    The Mechanical MNIST – Distribution Shift dataset contains the results of finite element simulation of heterogeneous material subject to large deformation due to equibiaxial extension at a fixed boundary displacement of d = 7.0. The result provided in this dataset is the change in strain energy after this equibiaxial extension. The Mechanical MNIST dataset is generated by converting the MNIST bitmap images (28x28 pixels) with range 0 - 255 to 2D heterogeneous blocks of material (28x28 unit square) with varying modulus in range 1 - s. The original bitmap images are sourced from the MNIST Digits dataset (http://www.pymvpa.org/datadb/mnist.html), which corresponds to Mechanical MNIST – MNIST, and the EMNIST Letters dataset (https://www.nist.gov/itl/products-and-services/emnist-dataset), which corresponds to Mechanical MNIST – EMNIST Letters. The Mechanical MNIST – Distribution Shift dataset is specifically designed to demonstrate three types of data distribution shift: (1) covariate shift, (2) mechanism shift, and (3) sampling bias, for all of which the training and testing environments are drawn from different distributions. For each type of data distribution shift, we have one dataset generated from the Mechanical MNIST bitmaps and one from the Mechanical MNIST – EMNIST Letters bitmaps. For the covariate shift dataset, the training dataset is collected from two environments (2500 samples from s = 100, and 2500 samples from s = 90), and the test data is collected from two additional environments (2000 samples from s = 75, and 2000 samples from s = 50). For the mechanism shift dataset, the training data is identical to the training data in the covariate shift dataset (i.e., 2500 samples from s = 100, and 2500 samples from s = 90), and the test datasets are from two additional environments (2000 samples from s = 25, and 2000 samples from s = 10).
For the sampling bias dataset, datasets are collected such that each datapoint is selected from the broader MNIST and EMNIST input bitmaps with a probability controlled by a parameter r. The training data is collected from two environments (9800 from r = 15, and 200 from r = -2), and the test data is collected from three different environments (2000 from r = -5, 2000 from r = -10, and 2000 from r = 1). Thus, in the end we have 6 benchmark datasets with multiple training and testing environments in each. The enclosed document “folder_description.pdf” shows the organization of each zipped folder provided on this page. The code to reproduce these simulations is available on GitHub (https://github.com/elejeune11/Mechanical-MNIST/blob/master/generate_dataset/Equibiaxial_Extension_FEA_test_FEniCS.py).
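The bitmap-to-modulus conversion described in this record can be sketched as follows. This is a minimal illustration that assumes a linear map from pixel value to modulus; the description only states the input range (0 - 255) and the output range (1 - s), so the exact formula should be checked against the authors' GitHub code. The function names are hypothetical and are not part of the dataset.

```python
def pixel_to_modulus(pixel, s):
    """Map a pixel value in [0, 255] to a modulus in [1, s].

    ASSUMPTION: the map is linear. The dataset description gives only
    the two ranges, not the formula.
    """
    if not 0 <= pixel <= 255:
        raise ValueError("pixel value must be in [0, 255]")
    return 1.0 + (pixel / 255.0) * (s - 1.0)


def bitmap_to_modulus(bitmap, s):
    """Apply pixel_to_modulus to every entry of a 2D list of pixel values."""
    return [[pixel_to_modulus(p, s) for p in row] for row in bitmap]


# Tiny 2x2 stand-in for a 28x28 bitmap, converted with s = 100
# (one of the training environments named in the description).
example_bitmap = [[0, 255], [51, 102]]
example_modulus = bitmap_to_modulus(example_bitmap, s=100)
```

Under this assumed map, a black pixel (0) gives the softest material (modulus 1) and a white pixel (255) gives the stiffest (modulus s), so lowering s from 100 to 10 narrows the stiffness contrast between environments.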
  • Item
    Dataset for: Mother-offspring proximity maintenance as an infanticide avoidance strategy in Bornean orangutans
    (2022-05-19) Scott, Amy; Knott, Cheryl; Mitra Setia, Tatang; Susanto, Tri Wahyu
  • Item
    Simplified clinical algorithm for identifying patients eligible for immediate initiation of antiretroviral therapy for HIV (SLATE): protocol for a randomised evaluation
    (Boston University, 2017-05) Rosen, Sydney; Fox, Matthew P.; Larson, Bruce A.; Brennan, Alana T.; Maskew, Mhairi; Tsikhutsu, Isaac; Ehrenkranz, Peter D.; Venter, Wd Francois
    INTRODUCTION: African countries are rapidly adopting guidelines to offer antiretroviral therapy (ART) to all HIV-infected individuals, regardless of CD4 count. For this policy of 'treat all' to succeed, millions of new patients must be initiated on ART as efficiently as possible. Studies have documented high losses of treatment-eligible patients from care before they receive their first dose of antiretrovirals (ARVs), due in part to a cumbersome, resource-intensive process for treatment initiation, requiring multiple clinic visits over a several-week period. METHODS AND ANALYSIS: The Simplified Algorithm for Treatment Eligibility (SLATE) study is an individually randomised evaluation of a simplified clinical algorithm for clinicians to reliably determine a patient's eligibility for immediate ART initiation without waiting for laboratory results or additional clinic visits. SLATE will enroll and randomize (1:1) 960 adult, HIV-positive patients who present for HIV testing or care and are not yet on ART in South Africa and Kenya. Patients randomized to the standard arm will receive routine, standard of care ART initiation from clinic staff. Patients randomized to the intervention arm will be administered a symptom report, medical history, brief physical exam and readiness assessment. Patients who have positive (satisfactory) results for all four components of SLATE will be dispensed ARVs immediately, at the same clinic visit. Patients who have any negative results will be referred for further clinical investigation, counseling, tests or other services prior to being dispensed ARVs. After the initial visit, follow-up will be by passive medical record review. The primary outcomes will be ART initiation ≤28 days and retention in care 8 months after study enrollment. 
ETHICS AND DISSEMINATION: Ethics approval has been provided by the Boston University Institutional Review Board, the University of the Witwatersrand Human Research Ethics Committee (Medical) and the KEMRI Scientific and Ethics Review Unit. Results will be published in peer-reviewed journals and made widely available through presentations and briefing documents.
  • Item
    Dataset for "Barriers and facilitators to facility-based delivery in rural Zambia: A qualitative study of women’s perceptions after implementation of an improved Maternity Waiting Homes intervention"
    Scott, Nancy A.; Fong, Rachel M.; Kaiser, Jeanette L.; Ngoma, Thandiwe; Vian, Taryn; Bwalya, Misheck; Sakanga, Viviane R.; Lori, Jody R.; Musonda, Gertrude; Munro-Kramer, Michelle L.; Rockers, Peter C.; Ahmed Mdluli, Eden; Biemba, Godfrey; Hamer, Davidson H.
    Objectives: Women in sub-Saharan Africa face well-documented barriers to facility-based deliveries. An improved maternity waiting homes (MWH) model was implemented in rural Zambia to bring pregnant women closer to facilities for delivery. We qualitatively assessed whether MWHs changed perceived barriers to facility delivery among remote-living women. Design: We administered in-depth interviews (IDIs) to a randomly-selected subsample of women in intervention (n=78) and control (n=80) groups who participated in the primary quasi-experimental evaluation of an improved MWH model. The IDIs explored perceptions and preferences of delivery location. We conducted content analysis to understand perceived barriers and facilitators to facility delivery. Setting and participants: Participants lived in villages 10+ kilometers from the health facility and had delivered a baby in the previous 12 months. Intervention: The improved MWH model was implemented at 20 rural health facilities. Results: Over 96% of participants in the intervention arm and 90% in the control arm delivered their last baby at a health facility. Key barriers to facility delivery were distance and transportation, and costs associated with delivery. Facilitators included no user fees, penalties for home delivery, desire for safe delivery, and availability of MWHs. Most themes were similar between study arms. Both discussed the role MWHs have in improving access to facility-based delivery. Intervention arm participants expressed that the improved MWH model encourages use and helps overcome the distance barrier. Control arm participants either expressed a desire for an improved MWH model or did not consider it in their decision-making. Conclusions: Even in areas with high facility-based delivery rates in rural Zambia, barriers to access persist. MWHs may be useful to address the distance challenge, but no single intervention is likely to address all barriers experienced by rural, low-resourced populations. 
MWHs should be considered in a broader systems approach to improving access in remote areas. Trial Registration: ClinicalTrials.gov Identifier: NCT02620436
  • Item
    Mechanical MNIST - Cahn-Hilliard
    (2022) Kobeissi, Hiba; Lejeune, Emma
    The Mechanical MNIST Cahn-Hilliard dataset contains the results of 104,813 Finite Element simulations of a heterogeneous material domain subject to large equibiaxial extension deformation. The heterogeneous domain patterns are generated from a Finite Element implementation of the Cahn-Hilliard equation. Different stripe and circle patterns are obtained by varying four simulation parameters: the initial concentration, the grid size on which the concentration is initialized, parameter $\lambda$, and $b$, the peak-to-valley value of the symmetric double-well chemical free-energy function. Binary bitmap images of 400 x 400 pixels are converted into two-dimensional meshed domains of binary material using the OpenCV library, Pygmsh, and Gmsh 4.6.0. We also include in this dataset the 104,813 patterns (37,523 from case 1, 37,680 from case 2, and 29,610 from case 3) used in the Finite Element simulations stored as binary images in text files. After pattern generation, the material domain is modeled as a unit square of Neo-Hookean binary material (high concentration areas correspond to Young's Modulus 10, low concentration areas correspond to Young's Modulus 1). For equibiaxial extension, each of the four edges of the domain is displaced to 50% of the initial domain size in the direction of the outward normal to the surface with fixed displacements (d = [0.0,0.001,0.1,0.2,0.3,0.4,0.5]). Here we provide the simulation results consisting of the following: (1) change in strain energy reported at each level of applied displacement, (2) total reaction force at the four boundaries reported at each level of applied displacement, and (3) full field displacement reported at the final applied displacement d=0.5. All Finite Element simulations are conducted with the FEniCS computing platform (https://fenicsproject.org). 
The code to reproduce these simulations (both pattern generation simulations and equibiaxial extension simulations) is hosted on GitHub (https://github.com/elejeune11/Mechanical-MNIST-Cahn-Hilliard). The enclosed document “description.pdf” contains additional details.
  • Item
    Pioneer-Voyager-Galileo radio occultations
    (2022-02-09) Narvaez, Clara
    Figures of early radio occultations at Jupiter by Pioneer 10, Pioneer 11, Voyager 1, and Voyager 2 available via journals, as well as occultations from Galileo that have not been published before, have been digitized and made available.
  • Item
    Asymmetric Buckling Columns (ABC)
    (2022) Prachaseree, Peerasait; Lejeune, Emma
    The Asymmetric Buckling Columns (ABC) dataset contains spatially heterogeneous columns with fixed-fixed boundary conditions that are classified as buckling left (label of 0) or right (label of 1). The dataset is split into 3 sub-datasets (sub-dataset 1, sub-dataset 2, and sub-dataset 3) with increasing levels of geometric complexity. For each sub-dataset, we provide information to reconstruct the domain geometry as txt files (subdataset*_geometry.zip), graphs from Simple Linear Iterative Clustering (SLIC) segmentation with varying degrees of node density as json files (subdatset*_sparse_graphs.zip, subdatset*_medium_graphs.zip, subdatset*_dense_graphs.zip), and output labels as txt files (subdataset*_labels.zip). In brief, sub-dataset 1 is generated by stacking blocks with varying widths, sub-dataset 2 consists of overlapping rings of identical size, and sub-dataset 3 consists of overlapping and trimmed rings of varying sizes. For sub-dataset 1 geometry files, "x.txt" indicates the centers of each block and "l.txt" gives the length of each block. Each block is stacked top to bottom. For sub-dataset 2, the files in folder "x" and folder "y" give the "x" and "y" coordinates for each ring with inner radius of 0.15w and outer radius of 0.25w. Note that the number of rings differs between structures. For sub-dataset 3, "x.txt" and "y.txt" contain the "x" and "y" coordinates of each ring, and "outer.txt" and "inner.txt" give the ring outer thickness and the ratio of inner thickness to outer thickness, respectively. Note that the coordinates for all domains are defined such that the origin is in the top left. Details on how to generate ground truth domain geometry with the provided geometric information and the corresponding graphs and finite element meshes are provided on GitHub (https://github.com/pprachas/ABC_dataset).
The json graph files are the graphs used in the corresponding manuscript, and the code to load the json files with PyTorch and PyTorch Geometric is also provided. All sub-datasets contain 25,000 simulation results. For the manuscript, 20,000 data points are used to train the ML model, with 2,500 data points used for validation and 2,500 data points held out as test data. The graphs provided are first shuffled and then split into train, validation, and test data. Details of our protocol for shuffling and splitting the data are also provided on GitHub.
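The shuffle-then-split step described in this record can be sketched as follows. This is only an illustration of a 20,000 / 2,500 / 2,500 split over 25,000 samples; the seed, the function name, and the use of Python's `random` module are assumptions, and the authors' actual protocol is the one documented in their GitHub repository.

```python
import random


def shuffle_and_split(n_total=25_000, n_train=20_000, n_val=2_500, seed=0):
    """Shuffle sample indices, then cut them into train/validation/test.

    ASSUMPTION: the seed and the contiguous-slice split are illustrative
    choices, not the authors' documented protocol.
    """
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains is held out
    return train, val, test


train_idx, val_idx, test_idx = shuffle_and_split()
```

Splitting index lists rather than the graphs themselves keeps the sketch independent of how the json files are loaded; the same indices can then select graphs and labels consistently.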
  • Item
    The multiplier effect of endogenous technical change accelerates the transition to photovoltaic cells
    (2022) Tendler, Anita C.; Kaufmann, Robert K.
    Abating emissions of carbon dioxide depends in part on how quickly the levelized cost of electricity (LCOE) from photovoltaic cells (PV) achieves grid parity without policy interventions. Reaching this threshold is accelerated by learning by doing, which reduces the LCOE generated by PV and increases installed capacity. Here, we expand previous estimates of unidirectional (Capacity → Price) learning curves to include the effect of prices on capacity by estimating a cointegrating vector autoregression (CVAR) model, which can capture a simultaneous relation between price and capacity. Results indicate that the simultaneous relation between price and capacity increases the estimate for the learning rate and creates a multiplier that amplifies static effects by nearly a factor of ten. This same multiplier effect enhances the ability of policies, such as a carbon tax, to lower the costs of PV, increase capacity, and lower carbon emissions. Together, these results suggest that grid parity is closer than indicated by unidirectional learning curves.