Category Archives: Databases

Creating a local copy of PubMed

I just updated a previous post about my plans to create a local copy of PubMed, adding a few more remarks about how I intend to proceed. I hope to have that work completed in the next two or three weeks. Check it out here:

Integrating the various data sources (PubMed, our existing BibTeX database, and records in MARC format) will be a challenge, and semantic interoperability will be the hardest part of it.

BibTeX at the Darwin Manuscripts Project and BHL

Poking around the Biodiversity Heritage Library Tools page, I came across this question from the FAQ:

Question: What is the BibTex format that I see as a download option?

BibTex is a common format for citations/references and is supported by all the major software vendors (EndNote, RefWorks, Zotero, Biblio). This functionality lets a user view & export a BibTex file for any title, including its items, from the bibliography page, as in this example:

BHL is also going to make this format available for download alongside our custom data exports, such that users can download a BibTex file that contains 1) all the *titles* in BHL including links to each, and 2) all the *items* in BHL (each volume) along with links. We need this export to move title-level metadata from the BHL portal to the article repository, so thought we might as well make the file available for others to use.
In effect, this would put BHL titles & volumes in a format easily understood by existing reference management applications.

When deciding whether our big database of works about evolution at the Darwin Manuscripts Project would use EndNote or BibTeX managed by way of BibDesk, I opted for BibTeX, a smart decision, if I do say so myself. It’s served us well in the many years that we’ve been using it, and it looks like it will continue to be useful. Nelson Beebe is developing (or has finished developing) some scripts to represent BibTeX databases in MySQL tables. He provides some useful links to related software tools that are needed as adjuncts to his scripts. In a paper in TUGboat (forthcoming? in vol 30, issue 1, Nov 2009), he explains a little bit about BibTeX, relational databases, and what’s involved in representing a .bib file as a relational database.
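
To give a rough idea of what representing a .bib file as relational tables can involve, here is a minimal Python sketch. It is not Beebe's scripts and does not reproduce his schema. It uses SQLite from the Python standard library as a stand-in for MySQL, and the function names (parse_bib, load_into_sqlite), the two-table layout, and the sample entries are all made up for illustration. The parser only handles simple fields with no nested braces, so a real .bib file would need a proper BibTeX parser.

```python
import re
import sqlite3

# Made-up placeholder entries, not records from any real database.
SAMPLE_BIB = """
@book{example2009,
  title = {An Example Title},
  author = {Doe, Jane and Roe, Richard},
  year = {2009}
}
@article{sample2008,
  title = "A Sample Article",
  journal = {Journal of Examples},
  year = 2008
}
"""

# Assumes each entry ends with a closing brace alone at the start of a line.
ENTRY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,(.*?)\n\}", re.DOTALL)
# Matches name = {value}, name = "value", or name = bareword (no nested braces).
FIELD_RE = re.compile(r'(\w+)\s*=\s*(?:\{([^{}]*)\}|"([^"]*)"|(\w+))')


def parse_bib(text):
    """Return a list of (entry_type, citekey, fields_dict) tuples."""
    entries = []
    for match in ENTRY_RE.finditer(text):
        entry_type = match.group(1).lower()
        citekey = match.group(2)
        fields = {}
        for field in FIELD_RE.finditer(match.group(3)):
            # Exactly one of the three value groups is set for each match.
            value = next(g for g in field.groups()[1:] if g is not None)
            fields[field.group(1).lower()] = value.strip()
        entries.append((entry_type, citekey, fields))
    return entries


def load_into_sqlite(entries, conn):
    """Store entries in two tables: one row per entry, one row per field."""
    conn.execute(
        "CREATE TABLE entry (citekey TEXT PRIMARY KEY, entrytype TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE field ("
        " citekey TEXT NOT NULL REFERENCES entry(citekey),"
        " name TEXT NOT NULL,"
        " value TEXT,"
        " PRIMARY KEY (citekey, name))"
    )
    for entry_type, citekey, fields in entries:
        conn.execute("INSERT INTO entry VALUES (?, ?)", (citekey, entry_type))
        conn.executemany(
            "INSERT INTO field VALUES (?, ?, ?)",
            [(citekey, name, value) for name, value in fields.items()],
        )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_into_sqlite(parse_bib(SAMPLE_BIB), conn)
    # Example query: the year of every entry, joined across the two tables.
    query = (
        "SELECT e.citekey, f.value FROM entry e"
        " JOIN field f ON f.citekey = e.citekey"
        " WHERE f.name = 'year' ORDER BY e.citekey"
    )
    for row in conn.execute(query):
        print(row)
```

Keeping fields in a separate name/value table means any BibTeX field can be stored without changing the schema, at the cost of needing a join for every lookup. A fixed column per field (title, author, year, and so on) would be the other obvious choice.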

If anyone out there’s had experience creating relational databases from .bib files, feel free to comment on this post, or to let me know how I can contact you to ask questions and listen to any tips, warnings, etc. you might have.