
Chemical & Biological Engineering Research: Datasets

Last Updated: Oct 17, 2024 11:21 AM

Finding Datasets

Below is a non-exhaustive list of free and public sources for datasets.

Selected National Institutes of Health Datasets & Data Repositories

  • PubChem
    PubChem provides information on the biological activities and properties of over 92 million small molecules. It holds substance information, compound structures, and bioactivity data in three primary databases: Substance, Compound, and BioAssay, respectively. The Substance database contains more than 223 million records, the Compound database more than 92 million unique structures, and the BioAssay database more than 1.2 million bioassays. The databases can be searched by chemical name, Chemical Abstracts Service (CAS) Registry Number, keyword, and structure. PubChem is an initiative of the National Center for Biotechnology Information (NCBI) of the National Library of Medicine (NLM). For more information, see the PubChem FAQ page.
     
  • Mouse Phenome Database
    A collaborative, standardized collection of measured data on laboratory mouse strains and populations. Its purpose is to characterize these strains and populations to facilitate translational discoveries and to assist in the selection of strains for experimental studies. Includes baseline phenotype datasets as well as studies of drug, diet, disease, and aging effects. Also includes protocols, projects, and publications, as well as SNP, variation, and gene expression studies.
     
  • All of Us Data
    The National Institutes of Health’s All of Us Research Program is building one of the largest biomedical data resources of its kind. The All of Us Research Hub stores health data from a diverse group of participants from across the United States. Registered users can use the Researcher Workbench and its tools to explore the data in depth, conduct rapid, hypothesis-driven research, and build new methods. The diverse data may facilitate new studies that could lead to new insights, treatments, and disease-prevention strategies tailored to individuals.
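
As an illustration of the name-based searching described in the PubChem entry above, the following is a minimal Python sketch that looks up a compound through PubChem's PUG REST web service. It is not part of this guide's source material. The function names (build_property_url, fetch_properties) and the choice of properties are illustrative assumptions, and the shape of the JSON response assumed here should be checked against the current PubChem documentation before relying on it.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base address of PubChem's PUG REST service.
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def build_property_url(name, properties=("MolecularFormula", "MolecularWeight")):
    """Build a PUG REST URL that looks up a compound by chemical name.

    The helper name and the default property list are illustrative choices.
    """
    encoded_name = quote(name, safe="")  # percent-encode spaces, slashes, etc.
    property_list = ",".join(properties)
    return f"{PUG_REST_BASE}/compound/name/{encoded_name}/property/{property_list}/JSON"


def fetch_properties(name, properties=("MolecularFormula", "MolecularWeight"), timeout=30):
    """Request the properties from PubChem and return the list of records.

    Assumes the response is JSON of the form
    {"PropertyTable": {"Properties": [...]}}; requires network access.
    """
    url = build_property_url(name, properties)
    with urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    return payload["PropertyTable"]["Properties"]
```

Calling fetch_properties("aspirin") should return one record per matching compound. CAS Registry Numbers can usually be passed in place of the name, since PubChem generally indexes them as synonyms, but that behavior is worth verifying for the compound of interest. PubChem also asks users to limit the rate of automated requests, so larger jobs may be better served by its bulk download options.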
     

Note: A complete listing of NIH Data Sharing Repositories is available at: https://www.nlm.nih.gov/NIHbmic/nih_data_sharing_repositories.html.