Biological Discovery and Biological Databases
The sequencing of the human genome was completed in April 2003, and it has had a profound effect on how biological and biomedical research is conducted. The sequencing effort has yielded human sequence variation data, model organism sequence data, and information on gene structure and function, all of which give researchers a foundation for better designing and interpreting their experiments, fulfilling the promise of bioinformatics in advancing and accelerating biological discovery.
GenBank is the database with which most researchers are familiar. It is the annotated collection of all publicly available DNA and protein sequences. This database, maintained by the National Center for Biotechnology Information (NCBI) at the National Institutes of Health, represents a collaborative effort between NCBI, the European Molecular Biology Laboratory (EMBL), and the DNA Data Bank of Japan (DDBJ).
The Human Genome Project, along with other sequencing projects, has produced a vast amount of sequence data. For example, the number of bases in GenBank doubles every 14 months, and this exponential growth rate is expected to continue for some time to come.
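To make the doubling figure concrete, the following minimal sketch computes how much a collection grows under a constant 14-month doubling time. The function names are illustrative, and the constant doubling time is an assumption taken from the figure quoted above; this is not a model of actual GenBank release statistics.

```python
import math

# Doubling time quoted in the text; assumed constant for this sketch.
DOUBLING_MONTHS = 14


def growth_factor(months, doubling_months=DOUBLING_MONTHS):
    """Return the multiplicative growth after `months` of steady doubling."""
    return 2 ** (months / doubling_months)


def months_to_grow(factor, doubling_months=DOUBLING_MONTHS):
    """Return the months needed to grow by `factor` under the same assumption."""
    return doubling_months * math.log2(factor)


if __name__ == "__main__":
    print(f"Growth over 14 months: {growth_factor(14):.1f}x")
    print(f"Growth over 5 years:   {growth_factor(60):.1f}x")
    print(f"Months to grow 10x:    {months_to_grow(10):.1f}")
```

Under this assumption, five years of growth corresponds to a little more than four doublings, which is why storage and search tools must scale along with the data.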
GenBank, or any other biological database for that matter, serves little purpose unless the data can be easily searched and entries retrieved in a usable, meaningful format. Otherwise, sequencing efforts have no useful end, since the biological community as a whole cannot make use of the information hidden within these millions of bases and amino acids.
Even so, the range of publicly available biological data goes far beyond what is included in GenBank. Because the major public sequence databases need to store data in a generalized fashion, they often do not contain the more specialized types of information that would be of interest to specific segments of the biological community. To address this, many smaller, specialized databases have emerged. These databases, which contain information ranging from strain crosses to gene expression data, provide a valuable adjunct to the more visible public sequence databases, and users are encouraged to make intelligent use of both types of databases in their searches.
Biological databases play a useful and informative role in the biology community, providing the first step toward performing rigorous and accurate bioinformatic analyses.