Monday, March 26, 2007


Deep within Cornell University's Albert R. Mann Library website, I ran across an interesting digital book collection called The Hive and the Honeybee, which comprises 30 digitised books from the E.F. Phillips Beekeeping Collection. The thirty books, selected by scholars, date from 1807 to 1917, but the digitised portion of the collection grows as funds become available.



To the left of each page is a clear menu including Main, About, Search, Browse, Contact, Help. The help includes a list of FAQs, and also lists hints for general use, searching, browsing, and other topics. In essence, this collection could be used as a library teaching tool because it provides step-by-step explanations of Boolean, Proximity and Bibliographic searches. It also describes how and in what format documents or entire books can be saved and printed as PDFs. In this help section, we eventually discover that the Hive and Honeybee materials were encoded in simple SGML with a 40-element DTD that conforms to the TEI Guidelines. Books can be viewed either as page images or as OCR text, and either one page at a time or as an entire book.


Each of the 30 books was run through OCR prior to posting, so we can search the text of each book and also see the full text alongside the page shots. That's great, except that no one seems to have gone back to correct the OCR, as shown by the fact that the tall (or long) s and the ct ligatures were not read correctly. Here's an example of what the OCR did with the 1807 book: "But although many inter*-efting peculiarities have been difcovered, they are fo much interwoven' with errors, that no fubjed has given birth to more ab-furdities." The transcription of the word subject as "fubjed" really does look "abfurd".
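
For the curious, here is a minimal sketch of how a correction pass might catch the tall s misreads in that sample. It is not how the Mann Library processed these books; the function name long_s_candidates and the tiny KNOWN_WORDS list are hypothetical stand-ins of my own, and a real pass would need a full, period-appropriate dictionary. The idea is simply to try reading each "f" as an "s" and keep any spelling the word list recognises.

```python
from itertools import product

# Hypothetical stand-in word list (lowercase only); a real pass would
# load a full dictionary suited to early nineteenth-century English.
KNOWN_WORDS = {"discovered", "so", "subject", "absurd", "of", "first"}


def long_s_candidates(word, known=KNOWN_WORDS):
    """Return known words reachable by reading some 'f' characters as 's'.

    A word that is already known is returned unchanged, so genuine
    f-words such as "of" are left alone.
    """
    if word in known:
        return [word]

    # Positions where the OCR may have mistaken a tall s for an f.
    positions = [i for i, ch in enumerate(word) if ch == "f"]

    found = []
    # Try every combination of keeping 'f' or substituting 's'.
    for choice in product("fs", repeat=len(positions)):
        chars = list(word)
        for pos, ch in zip(positions, choice):
            chars[pos] = ch
        candidate = "".join(chars)
        if candidate in known and candidate not in found:
            found.append(candidate)
    return found


if __name__ == "__main__":
    for ocr_word in ["difcovered", "fo", "fubjed"]:
        print(ocr_word, long_s_candidates(ocr_word))
```

With this word list, "difcovered" and "fo" resolve to "discovered" and "so", but "fubjed" comes back empty: swapping the f only gives "subjed", because the ct ligature was misread as a d. That second error would need its own rule, or a fuzzier dictionary match, which is probably why this sort of clean-up tends to end with a human proofreader.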

Navigation within a book is a little slow to load, but is fairly good. A drop-down menu of pages allows you to select a page, and it even notes which ones are illustrated. The metadata visible to the user for each book is limited to Author, Title, Publication Info, Print source, Subject terms, and the URL. I see two problems with this: 1. Publication Info is invariably just the name of the Mann Library, without any call number or shelf location information for citation. 2. There is no indication of how long a book is to help the user determine how or whether to download it. The only way to determine this is to open a book and scroll down the page reference drop-down box, which can take a long time for books over 50 pages. As for viewing, the images of pages can be scaled up twice, which for printed text is usually sufficient but leaves a bit wanting for illustrations and photographs. A rotation tool could also be useful for anyone who does not want to print or download images that are rotated 90 degrees to fit into the book (i.e. landscape prints).

Another thing I regret is that most covers and endleaves were not photographed in the process. Much of the interesting provenance information is located on those leaves. This project (or more likely the preservation microfilm project from which the images were obtained) is clearly more interested in the book as text than in the book as artifact and object of cultural history.


I find it interesting that external funds were raised for this project from interested apiculturalists across the country, from beekeeping clubs, and as memorials in honour of bee lovers. This system of donation is similar to the way Phillips endowed the collection in 1925, when he asked each apiculturalist to set aside one hive to generate $50 in honey, which was contributed to the book fund. All of the donors for the digital version of the collection are listed on the credits page, which is linked from the About page. Also listed on the credits page are the experts who selected the materials, 4 site designers and developers, and 7 project staff (bibliographer, public programs administrator, preservation librarian, preservation assistant, two metadata services people and a copyright specialist).

If you have time, it's an interesting collection to browse (by author, but with 30 books, it doesn't much matter). As for me, I'm off for a piece of toast with honey now.
