The recommended browser is Firefox, particularly for the BioCyc data visualization and analysis tools.
Certain features require a User Account. These include creating groups of genes and metabolites for analysis with SmartTables, customization of BioCyc pages, storing organismal sets for comparative analyses, and configuring default settings for the various Omics Viewers.
BioCyc organism datasets are divided into three tiers based upon their curation quality:
- The Tier 1 organisms (human, Escherichia coli K-12 MG1655, Arabidopsis thaliana, Saccharomyces cerevisiae, Leishmania major Friedlin, and Trypanosoma brucei) are extensively manually curated from the published literature.
- The ~50 organisms in Tier 2 have had their metabolic pathways and operons computationally derived, followed by manual curation. Currently, Tier 2 organisms have undergone one year of literature-based curation.
- The vast majority of organisms are in Tier 3 and are solely computationally derived. Consequently, Tier 3 pathways should be treated with caution.
For detailed information on how the BioCyc, EcoCyc, and MetaCyc databases are computationally created, see “A Guide to the BioCyc Database collection”.
Getting Help: BioCyc offers an extensive set of help guides to its organismal databases and its many analysis and data visualization tools:
- Website User’s Guide: The various tools and how to access them, including SmartTables, creating metabolic maps and models, comparative analysis, and more.
- Guided Tour: Examples of data present in BioCyc.
- BioCyc User Guide: Information about BioCyc content.
- Webinars: 10-40 minute videos covering basic and advanced topics, ranging from introductions to searching, pathways, reactions and compounds, to more advanced topics including creating your first SmartTable, the Omics Viewer, or creating your own pathway/genome database from an annotated genome file.
BioCyc (UB only) is an integrated collection of more than 9,000 pathway/genome databases covering microbial organisms plus human, Drosophila, mosquito, and mouse. Each organism is contained within its own “database”. The type of data available varies with the organism and the current state of knowledge about it. Organism databases typically contain:
- the organism’s annotated genome
- predicted metabolic pathways, including predicted operons for bacterial genomes
- predicted atom mappings permitting tracking atoms from reactants to products
- metabolic and genome overview “posters”
- genome browser
Depending on the organism, additional data elements may include:
- protein subcellular locations
- enzyme kinetics data
- protein features, including predicted Pfam families
- promoters, operons, transcription factor binding sites
- orthologs to other BioCyc genomes
- organism phenotype data
- links to other databases
In addition, BioCyc contains several expanded databases:
- BsubCyc: A combined computationally derived and manually curated metabolic and regulatory network database for Bacillus subtilis 168, which includes 160 regulatory genes and 1,100 regulated genes. Based upon the sequence and annotation published by Barbe, V., et al., Microbiology 155: 1758-1775, 2009.
- EcoCyc: A comprehensive, literature-curated resource for the Escherichia coli K-12 MG1655 genome, including transcriptional regulation, transporters, and metabolic pathways.
- HumanCyc: Database created from computational pathway analysis followed by literature-based curation of pathways and enzymes.
- MetaCyc: Contains 2,400 metabolic pathways and 13,000 biochemical reactions manually curated from the published literature.
BioCyc supports sophisticated searches. Examples of the search types and criteria available:
- Accession number
- BLAST search with either a protein or a nucleotide sequence
- Cellular location
- Chemical formula (partial or full)
- Compound name or ChEBI, LIGAND, PubChem, or CAS identifier
- Cross-organismal search
- Gene ontology terms
- Gene, protein, or RNA name
- InChI (IUPAC International Chemical Identifier) string or key
- Pathway reactants or products
- Protein properties (pI, molecular weight, subcellular location, ligand)
- Protein features (active sites, calcium-binding regions, DNA-binding regions, repeats, alpha helix/beta strand/coiled-coil regions, transmembrane regions, signal sequences, sequence variants)
- Replicon position
- Monoisotopic mass (mass spectrometry)
- Small molecule regulator, cofactor, substrate, or ligand
A detailed Search Guide is available.
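To illustrate what a monoisotopic mass query is matching against, here is a minimal Python sketch that computes the monoisotopic mass of a simple chemical formula by summing the masses of the most abundant isotope of each element. It is not BioCyc code: the function name `monoisotopic_mass` and the small element table are assumptions made for this example, and only plain formulas without parentheses or charges are handled.

```python
import re

# Masses (in daltons) of the most abundant isotope of a few common elements.
# This short table is an assumption for the example, not a complete list.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376200,
    "S": 31.97207117,
}


def monoisotopic_mass(formula: str) -> float:
    """Return the monoisotopic mass of a plain formula such as 'C6H12O6'."""
    tokens = re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    # Reject empty input and anything the simple pattern could not fully parse.
    if not tokens or "".join(sym + cnt for sym, cnt in tokens) != formula:
        raise ValueError(f"Unsupported formula: {formula!r}")
    total = 0.0
    for symbol, count in tokens:
        if symbol not in MONOISOTOPIC_MASS:
            raise ValueError(f"No mass on file for element {symbol!r}")
        total += MONOISOTOPIC_MASS[symbol] * (int(count) if count else 1)
    return total


glucose_mass = monoisotopic_mass("C6H12O6")  # roughly 180.0634 Da
```

A mass measured by mass spectrometry can then be compared with such computed values within a chosen tolerance to suggest candidate compounds.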
Data visualization and analyses
BioCyc offers an extensive set of tools for data visualization and analysis.
The SmartTables feature lets researchers collect genes or pathways, along with associated data, into tabular form for analysis. Once created, a SmartTable can be analyzed in various ways, its genes can be “painted” onto a metabolic map, or it can be “transformed” into another SmartTable. Examples of SmartTable transformations include transforming a genes SmartTable into a promoter or transcription factor binding site SmartTable, or a pathway SmartTable into a metabolic substrates SmartTable.
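Conceptually, a SmartTable transformation maps each row of one table to related objects of another type. The Python sketch below shows that idea for a pathway-to-substrates transformation. It does not use the BioCyc software or its data: the `transform` function, the lookup table, and all pathway and compound names are hypothetical placeholders.

```python
# Hypothetical lookup from pathway to its substrates (placeholder names only).
PATHWAY_SUBSTRATES = {
    "pathway-1": ["compound-A", "compound-B"],
    "pathway-2": ["compound-B", "compound-C"],
}


def transform(rows, lookup):
    """Map each row through the lookup, keeping unique results in first-seen order."""
    seen = {}
    for row in rows:
        for item in lookup.get(row, []):
            seen.setdefault(item, None)  # dicts preserve insertion order
    return list(seen)


pathway_table = ["pathway-1", "pathway-2"]
substrate_table = transform(pathway_table, PATHWAY_SUBSTRATES)
```

Here `substrate_table` lists each substrate once, even when two pathways share it.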
Data visualization tools include a genome browser, cellular overviews (metabolic maps), metabolic modeling, metabolic route search, pathway collages (user-specified collections of pathways), and the Regulatory Overview, a network analysis tool that represents genes as nodes and regulatory relationships as arrows or arcs.
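The node-and-arrow representation used by a regulatory network view can be sketched as a small directed graph. The Python example below is only an illustration of that data structure, not the BioCyc implementation; the gene names, the edge list, and the helper functions `build_network` and `regulators_of` are hypothetical.

```python
from collections import defaultdict

# Hypothetical regulatory relationships: (regulator, target, effect).
EDGES = [
    ("geneA", "geneB", "activates"),
    ("geneA", "geneC", "represses"),
    ("geneB", "geneC", "activates"),
]


def build_network(edges):
    """Build an adjacency list: each regulator maps to its (target, effect) arrows."""
    network = defaultdict(list)
    for regulator, target, effect in edges:
        network[regulator].append((target, effect))
    return dict(network)


def regulators_of(network, gene):
    """Return, in sorted order, every gene with an arrow pointing at `gene`."""
    return sorted(
        regulator
        for regulator, targets in network.items()
        if any(target == gene for target, _ in targets)
    )


network = build_network(EDGES)
upstream = regulators_of(network, "geneC")
```

In this toy network, `upstream` contains both `geneA` and `geneB`, since each has an arrow pointing at `geneC`.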
Each organismal resource within BioCyc has its own creation date and update cycle. Below are the creation dates and last curation dates for the most commonly used resources. Go here to learn how to receive update notifications for any organism of your choosing.
|DATABASE||DATE CREATED||LAST UPDATED|
Continuously (linked to updates at Mouse Genome Informatics)