

As we move toward large-scale research into complex molecular and cellular networks, the research community will need to develop new interfaces to complex data sets. Existing databases and interfaces, such as those at EBI and NCBI, often use sequence records as the central organizing unit. A database organized around genome sequence records might be ideal for genome analysis, whereas the analysis of biological networks is better organized around genes and gene products. LocusLink (soon to be replaced by Entrez Gene) is an example of a resource that adopts the more suitable gene-centric view. Although it has an excellent user interface (UI), LocusLink does not provide a robust application programming interface (API). An API could be built on top of a web interface or a flat-file database, but this would make analysis tools unacceptably slow. In particular, APIs are needed for computers to process the sets of genes and gene products found in these biological networks. Both computational tools and advanced data mining environments need such APIs to access and manipulate large, diverse, and intersecting sets of data.

EBI's EnsMART is a resource that permits comparable manipulation of data about sets of genes and provides an API along with its UI. The database, however, is somewhat difficult to store locally because of its large size and complexity. Another database that is to some extent similar in design is the DRAGON database. We have developed GeneKeyDB, a relational database, in an attempt to address these issues. A schematic comparison of these databases and GeneKeyDB can be seen in Table 1.

(Footnotes to Table 1: ¹Refers only to machine access to the relational databases. ²GeneKeyDB serves as a data mining environment for different tools; these tools could therefore also be considered part of the interface layer.)

An interesting alternative to the above-mentioned databases is BioMART. This is not a database in the conventional sense (though the underlying data can also be downloaded). It extracts and integrates data from several sources, creating a customized database. While this approach is very powerful, not all data present in GeneKeyDB are available from BioMART sources (for example, CGAP expression data or HomoloGene). Moreover, BioMART requires human intervention to retrieve the customized database through a web interface, whereas GeneKeyDB can be updated entirely through scripts. The development of GeneKeyDB is motivated by the desire for a smaller database that can interoperate tightly with different local computational tools and local data sets.
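To make the argument for machine access to a local, gene-centric relational store more concrete, the following minimal sketch shows how an analysis tool might intersect two sets of genes with plain SQL. It is an illustration only: the two-table schema, the table and column names, the helper `genes_with_term`, and the placeholder gene symbols and terms are all assumptions made for this example and do not describe the actual GeneKeyDB schema or data. SQLite is used here only because it ships with Python.

```python
import sqlite3

# Hypothetical, simplified gene-centric schema (NOT the real GeneKeyDB layout):
# one row per gene, plus one row per (gene, annotation) pair.
SCHEMA = """
CREATE TABLE gene (
    gene_id INTEGER PRIMARY KEY,
    symbol  TEXT NOT NULL
);
CREATE TABLE annotation (
    gene_id INTEGER NOT NULL REFERENCES gene(gene_id),
    source  TEXT NOT NULL,   -- kind of annotation, e.g. "pathway" or "tissue"
    term    TEXT NOT NULL    -- annotation value within that source
);
"""


def build_demo_db():
    """Create an in-memory database filled with placeholder data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO gene (gene_id, symbol) VALUES (?, ?)",
        [(1, "GENE_A"), (2, "GENE_B"), (3, "GENE_C")],
    )
    conn.executemany(
        "INSERT INTO annotation (gene_id, source, term) VALUES (?, ?, ?)",
        [
            (1, "pathway", "P1"),
            (2, "pathway", "P1"),
            (2, "tissue", "T1"),
            (3, "tissue", "T1"),
        ],
    )
    return conn


def genes_with_term(conn, source, term):
    """Return the set of gene symbols annotated with `term` from `source`."""
    rows = conn.execute(
        "SELECT g.symbol FROM gene AS g "
        "JOIN annotation AS a ON a.gene_id = g.gene_id "
        "WHERE a.source = ? AND a.term = ?",
        (source, term),
    )
    return {symbol for (symbol,) in rows}


conn = build_demo_db()
in_pathway = genes_with_term(conn, "pathway", "P1")
in_tissue = genes_with_term(conn, "tissue", "T1")

# Genes present in both sets, computed locally without any web round trip.
print(sorted(in_pathway & in_tissue))  # prints ['GENE_B']
```

Because each gene set comes back as an ordinary in-memory set, a tool can combine sets with unions, intersections, and differences, and a script can rebuild or refresh the tables without manual steps.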
