
AmiGO is the ontological browser for Gene Ontology.
http://www.godatabase.org
|
Gene Ontology Lecture
Gene Ontology: Hands-on Annotation Workshop
|

ArrayExpress is a public repository for microarray data that aims to store well-annotated data in accordance with MGED recommendations.
http://www.ebi.ac.uk/arrayexpress/
|
ArrayExpress Lecture
ArrayExpress Hands-On Workshop |

BioMart is a federated Open Source search tool that permits rapid queries against large volumes of biological data. It is designed to give researchers easy, interactive access to the wealth of data available on the Internet and to support in-house data integration.
http://www.ebi.ac.uk/biomart
|
Introduction to BioMart Lecture
Using BioMart Hands-On Workshop
|

Sequence alignments provide a powerful way to compare novel sequences with previously characterized genes. Both functional and evolutionary information can be inferred from well-designed queries and alignments. BLAST 2.0 (Basic Local Alignment Search Tool) provides a method for rapid searching of nucleotide and protein databases.
http://www.ncbi.nlm.nih.gov/BLAST/
|
NCBI Field Guide
BLAST QuickStart Lecture and Hands-on Workshop
Making Sense of DNA and Protein Sequences Lecture and Hands-On Workshop
Structural Analysis QuickStart and Hands-On Workshop
BLAST—Beyond Point and Click,
an advanced Hands-On Workshop
|
Cn3D is a helper application for your web browser that allows you to view 3-dimensional structures from NCBI's Entrez retrieval service. Cn3D runs on Windows, Macintosh, and Unix. Cn3D simultaneously displays structure, sequence, and alignment, and now has powerful annotation and alignment editing features.
http://www.ncbi.nlm.nih.gov/Structure/CN3D/cn3d.shtml
|
NCBI Field Guide
Making Sense of DNA and Protein Sequences Lecture and Hands-On Workshop
Structural Analysis QuickStart and Hands-On Workshop
|

Proteins often contain several modules or domains, each with a distinct evolutionary origin and function. NCBI's Conserved Domain Database is a collection of multiple sequence alignments for ancient domains and full-length proteins.
http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml
|
NCBI Field Guide
Making Sense of DNA and Protein Sequences Lecture and Hands-On Workshop
Structural Analysis QuickStart Lecture and Hands-On Workshop
|

EMBOSS is an open source software suite of over 200 applications for the in silico analysis of biological problems ranging from nucleic acid and protein sequence analysis through the creation and indexing of your own data.
http://emboss.sourceforge.net
|
Introduction to EMBOSS Lecture
Programming with EMBOSS Workshop
|

Ensembl is a comparative genome browser resource for eukaryotes. It currently offers 14 completed eukaryotic genomes, along with genomic resources for the cow and opossum, whose genomes are still being assembled. Both assembled and pre-assembly databases are available for the latter two organisms.
http://www.ensembl.org
|
Ensembl Lecture
Ensembl Hands-On Workshop |

The NCBI Entrez portal provides integrated access to nucleotide and protein sequence data from >130,000 organisms, along with 3D protein structures, genomic mapping information, PubMed MEDLINE, and more. Sequence data are combined from various sources, including GenBank, EMBL, DDBJ, RefSeq, PIR-International, PRF, Swiss-Prot, and PDB. Entrez can be searched with a wide variety of text terms such as author name, journal name, gene or protein name, organism, unique identifier (e.g., accession number, sequence ID, PubMed ID), and other terms, depending on the database being searched.
http://www.ncbi.nlm.nih.gov/Entrez/
|
|
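The field-qualified text searches described above can also be written down programmatically. The following is a minimal sketch, assuming NCBI's E-utilities ESearch endpoint, which is not mentioned in the table itself. The helper name `build_esearch_url` and the example terms are hypothetical; the code only builds a query URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Assumption: the public E-utilities ESearch endpoint (not part of the original table).
ESEARCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(database, field_terms, max_results=20):
    """Build an ESearch URL from {field name: search text} pairs.

    Each pair becomes an Entrez-style 'text[Field]' clause, and the
    clauses are combined with AND. Hypothetical helper for illustration.
    """
    clauses = [f"{text}[{field}]" for field, text in field_terms.items()]
    query = {
        "db": database,
        "term": " AND ".join(clauses),
        "retmax": max_results,
    }
    return f"{ESEARCH_BASE}?{urlencode(query)}"


# Example: nucleotide records for a gene name restricted to one organism.
example_url = build_esearch_url(
    "nucleotide",
    {"Gene Name": "BRCA1", "Organism": "Homo sapiens"},
    max_results=5,
)
print(example_url)
```

Which field names are valid depends on the database being searched, so the fields used here should be checked against the chosen database before relying on them.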

Expression Profiler is a newly created Open Source, web-based environment for the analysis of mainly two types of data: gene expression and sequences. The system is designed to be extensible to other data types; support for protein-protein interaction (PPI) data is currently being added.
http://www.ebi.ac.uk/expressionprofiler/
|
ArrayExpress Lecture
ArrayExpress Hands-On Workshop |
|
GenBank® is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences.
http://www.ncbi.nlm.nih.gov/Genbank/
|
NCBI Field Guide
Making Sense of DNA and Protein Sequences Lecture and Workshop
|

Recently replacing LocusLink, GENE is a major access point to NCBI’s databases and sequence information, along with MapViewer (chromosome-related access point) and Entrez (text-based access). Gene, as its name implies, provides a gene-based view of the data from a wide range of genomes. It supplies key connections in the nexus of map, sequence, expression, structure, functional, and homology data. Each record represents a single gene from a given organism. The minimum set of data in a gene record includes a unique identifier or GeneID assigned by NCBI, a preferred symbol, and any of sequence information, map information, or official nomenclature from an authority list. In addition, a gene record can also include expression, structure, functional, and homology data, when available.
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene
|
NCBI Field Guide
Entrez GENE QuickStart Lecture and Workshop
|

Gene3D is a member database of EBI’s InterProScan consolidated proteomic resources, which will be covered in the Proteomics Workshop/lecture series during Bioinformatics Week. It contains over 850,000 protein sequences from completed genomes, clustered into protein families and annotated with CATH domains.
http://www.biochem.ucl.ac.uk/bsm/cath/Gene3D/
|
Proteomics Lecture
Proteomics Hands-On Workshop
(through InterPro)
|

NCBI provides several genomic biology tools and resources, including organism-specific pages with links to many web sites and databases relevant to that species, including incomplete genome assembly projects. Completed genomes are available through resource-specific portals, including Plant Genome Central and portals for viruses, microbial resources, plasmids, organelles, SARS, and influenza, to name a few.
http://www.ncbi.nlm.nih.gov/Genomes/
|
NCBI Field Guide |

Gene Ontology is a controlled vocabulary that can be applied to all organisms even as the knowledge of gene and protein roles is changing.
http://www.geneontology.org
|
Gene Ontology Lecture
Gene Ontology: Hands-on Annotation Workshop
|

IntAct provides a freely available, open source database system and analysis tools for protein interaction data. All interactions are derived from literature curation or direct user submissions and are freely available.
http://www.ebi.ac.uk/intact/index.jsp
|
Proteomics Lecture
Proteomics Hands-On Workshop
(through InterPro)
|

InterPro is a consolidated database of protein families, domains and functional sites in which identifiable features found in known proteins can be applied to unknown protein sequences.
http://www.ebi.ac.uk/interpro/index.html
|
Proteomics Lecture
Proteomics Hands-On Workshop
|

NCBI provides different ways for scientists to access and view sequence-related information. The Map Viewer is a powerful graphical interface which supports search and display of genomic information and expression by chromosomal position. Regions of interest can be retrieved by text queries (e.g. gene or marker name) or by sequence alignment (BLAST). View results at the whole genome level, and select what to display in more detail. Multiple options exist to configure your display, download data, navigate to related data, and analyze supporting information using the tools provided.
http://www.ncbi.nlm.nih.gov/mapview/
|
NCBI Field Guide |
|
NCBI's structure database is called MMDB (Molecular Modeling DataBase), and it is a subset of three-dimensional structures obtained from the Protein Data Bank. It was designed for flexibility, and as such, is capable of archiving conventional structural data as well as future descriptions of biomolecules, such as those generated by electron microscopy (surface models).
http://www.ncbi.nlm.nih.gov/Structure/MMDB/mmdb.shtml
|
NCBI Field Guide |

The EBI Macromolecular Structure Database is a European project for the collection, management and distribution of data about macromolecular structures, derived in part from the Protein Data Bank (PDB).
http://www.ebi.ac.uk/msd/index.html
|
Proteomics Lecture
Proteomics Hands-On Workshop
(through InterPro) |

The PANTHER (Protein ANalysis THrough Evolutionary Relationships) Classification System was designed to classify proteins (and their genes) modeled on the divergence of function.
https://panther.appliedbiosystems.com/
|
Proteomics Lecture
Proteomics Hands-On Workshop
(through InterPro) |

Pfam is a collection of protein families and domains. Pfam contains multiple protein alignments of these families. Pfam is a semi-automatic protein family database, which aims to be comprehensive as well as accurate.
http://www.sanger.ac.uk/Software/Pfam/index.shtml
|
Proteomics Lecture
Proteomics Hands-On Workshop
(through InterPro)
|

The Protein Information Resource SuperFamily (PIRSF) is a classification system based on the evolutionary relationships of whole proteins.
http://pir.georgetown.edu/pirsf/
|
Proteomics Lecture
Proteomics Hands-On Workshop
(through InterPro) |

PRINTS is a compendium of protein fingerprints. A fingerprint is a group of conserved motifs used to characterize a protein family. Fingerprints can encode protein folds and functionalities more flexibly and powerfully than can single motifs, full diagnostic potency deriving from the mutual context provided by motif neighbors.
http://umber.sbs.man.ac.uk/dbbrowser/PRINTS/
|
Proteomics Lecture
Proteomics Hands-On Workshop
(through InterPro) |
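To make the fingerprint idea concrete, here is a toy sketch in which a fingerprint is an ordered list of motifs and a sequence matches only when every motif is found, in order. This is not how PRINTS itself scores matches: exact regular-expression matching is swapped in for simplicity, and the motifs, the sequences, and the function name `match_fingerprint` are all hypothetical.

```python
import re


def match_fingerprint(sequence, motifs):
    """Return the start position of each motif if all occur in order, else None.

    Each motif is a regular expression. The search for a motif begins
    where the previous motif's match ended, so motif order matters.
    """
    positions = []
    search_from = 0
    for motif in motifs:
        hit = re.compile(motif).search(sequence, search_from)
        if hit is None:
            return None  # one missing motif means no fingerprint match
        positions.append(hit.start())
        search_from = hit.end()
    return positions


# Hypothetical three-motif fingerprint and toy sequences.
toy_fingerprint = ["G.GK[ST]", "DE.D", "W..H"]
full_match = "MAGSGKTLLDEADVVWQRHK"   # contains all three motifs in order
single_motif = "MAGSGKTLLAAAAVV"      # contains only the first motif

print(match_fingerprint(full_match, toy_fingerprint))
print(match_fingerprint(single_motif, toy_fingerprint))
```

The second sequence would be reported by a search for the first motif alone, but it is rejected by the fingerprint, which illustrates how neighboring motifs supply the context that a single motif lacks.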

ProDom is a comprehensive set of protein domain families automatically generated from the SWISS-PROT and TrEMBL sequence databases.
http://prodes.toulouse.inra.fr/prodom/current/html/home.php
|
Proteomics Lecture
Proteomics Hands-On Workshop
(through InterPro) |

Prosite is a database of protein families and domains. It consists of biologically significant sites, patterns and profiles that help to reliably identify to which known protein family (if any) a new sequence belongs.
http://us.expasy.org/prosite/
|
Proteomics Lecture
Proteomics Hands-On Workshop
(through InterPro) |

The Reference Sequence (RefSeq) collection aims to provide a comprehensive, integrated, non-redundant set of sequences, including genomic DNA, transcript (RNA), and protein products, for major research organisms.
http://www.ncbi.nlm.nih.gov/RefSeq/
|
NCBI Field Guide |

SCOP, a Structural Classification Of Proteins, is a database which aims to provide a broad survey of all known protein folds to facilitate a comprehensive description of the structural and evolutionary relationships between all proteins whose structure is known.
http://scop.mrc-lmb.cam.ac.uk/scop/
|
Proteomics Lecture
Proteomics Hands-On Workshop
(through InterPro) |

SRS, the Sequence Retrieval System, is a comprehensive web-based, cross-database search interface to more than 150 resources covering protein and nucleic acid sequence information. This includes the biological, clinical and patent literature, sequence and mutation databanks, biological resource catalogues holding cell line information, metabolic pathways and more. For a complete listing of the searchable databanks available through SRS, which will be part of the lecture/workshop offered during Bioinformatics Week, see here (http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz?-page+top+-newId).
http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz?-page+srsq2+-noSession
|
Sequence Retrieval System (SRS) Lecture
Sequence Retrieval System (SRS) Hands-On Workshop |

SNP stands for "single nucleotide polymorphism". A key aspect of research in genetics is associating sequence variations with heritable phenotypes. The most common variations are single nucleotide polymorphisms (SNPs), which occur approximately once every 100 to 300 bases. In collaboration with the National Human Genome Research Institute, the National Center for Biotechnology Information has established the dbSNP database to serve as a central repository for both single base nucleotide substitutions and short deletion and insertion polymorphisms.
http://www.ncbi.nlm.nih.gov/SNP/
|
NCBI Field Guide |
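As a rough illustration of the density quoted above (about one SNP every 100 to 300 bases), the sketch below estimates how many SNPs one might expect in a region of a given length. The function name and the example region length are hypothetical, and the estimate assumes SNPs are spread evenly along the sequence, which real genomes do not guarantee.

```python
def expected_snp_range(region_length_bases, min_spacing=100, max_spacing=300):
    """Return (low, high) SNP counts for a region, assuming even spacing.

    min_spacing and max_spacing are the quoted bounds of one SNP
    per 100 to 300 bases; wider spacing gives the lower count.
    """
    if region_length_bases < 0:
        raise ValueError("region length must be non-negative")
    low = region_length_bases // max_spacing
    high = region_length_bases // min_spacing
    return low, high


# Hypothetical example: a 30,000-base region.
low, high = expected_snp_range(30_000)
print(f"Expected SNPs in a 30,000-base region: {low} to {high}")
```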

SMART (Simple Modular Architecture Research Tool) is a web tool for the identification and annotation of protein domains. It provides a platform for the comparative study of complex domain architectures in genes and proteins.
http://smart.embl-heidelberg.de/
|
Proteomics Lecture
Proteomics Hands-On Workshop
(through InterPro) |

The purpose of Superfamily is to provide structural (and hence implied functional) assignments to protein sequences at the superfamily level. A superfamily contains all proteins for which there is structural evidence of a common evolutionary ancestor.
http://supfam.mrc-lmb.cam.ac.uk/SUPERFAMILY/
|
Proteomics Lecture
Proteomics Hands-On Workshop
(through InterPro) |

TIGRFAMs are protein families based on Hidden Markov Models (HMMs), created by TIGR, The Institute for Genomic Research.
http://www.tigr.org/TIGRFAMs/index.shtml
|
Proteomics Lecture
Proteomics Hands-On Workshop
(through InterPro) |

UniGene is an experimental system for automatically partitioning GenBank sequences into a non-redundant set of gene-oriented clusters. Each UniGene cluster contains sequences that represent a unique gene, as well as related information such as the tissue types in which the gene has been expressed and map location.
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=unigene
|
NCBI Field Guide |

UniProt (Universal Protein Resource) is the world's most comprehensive catalogue of information on proteins. It is a central repository of protein sequence and function. Because it was created by merging the data in Swiss-Prot, TrEMBL and PIR-PSD, individual UniProt Knowledgebase entries may contain more information than was available in any single source database.
http://www.ebi.ac.uk/uniprot/
|
Proteomics Lecture
Proteomics Hands-On Workshop
(through InterPro) |