A brief history of federated & web-scale search in the library

From the Fall 2014 Newsletter

The Lloyd Sealy Library, with considerable help from CUNY, currently provides access to over 200 databases, multiple ebook collections, six streaming video collections, and over 80,000 online journals, not to mention the millions of books listed in the CUNY online library catalog (CUNY+). Students (and faculty, too) are understandably bewildered by this surfeit of riches. “Give me a single search box,” they cry, “let me put some words in it, and out should come the books and articles that I need to write my paper.”

Twenty years ago that was a ridiculous idea, but since the rise of the Google search engine, the general public has learned that they can, in fact, just put a term into a search box and pretty much all the time get the information they want. Why can’t this happen in the world of scholarly writing?

Actually, to a certain extent, it can. Library publishers and database vendors have been experimenting with this idea since the mid-2000s. “Federated search” engines were developed, which let users enter search terms that were then turned into queries sent to multiple distinct databases at the same time. The Lloyd Sealy Library subscribed to such a federated search service beginning in 2009; we called it “Hound Hunt.” Federated search was slow and clunky; results were incomplete and duplicated; and extra clicks were needed to reach the full text of an article. Hound Hunt lasted until summer 2013, but it never really took off in popularity at John Jay.

The library community knew that a more “Google-like” experience was needed, and a number of library vendors developed “web-scale discovery services.” Instead of a search bot performing separate searches on multiple different databases, publishers and database vendors agreed to contribute their metadata to huge merged indexes. These indexes can be searched quickly and painlessly via a “discovery” layer that displays results in a user-friendly and intuitive manner, leading seamlessly to the full text of articles, books and even media. Anticipating that the CUNY libraries would be moving to a discovery service offered by the vendor of our online catalog, but knowing that this service was at least a year away, the Lloyd Sealy Library, with the help of Student Technology Fee funds, subscribed to the EBSCO Discovery Service (EDS) beginning in August 2013. Our users immediately found this search to be faster, easier and more rewarding, as usage figures showed.

Unfortunately, EDS did not search the books listed in CUNY+; it searched only the combined indexes of multiple article databases. But the CUNY Office of Library Services has now begun implementing the Primo Discovery Service, which can search multiple databases plus CUNY+ all at once. The CUNY implementation of Primo, named CUNY OneSearch, is now available in beta form on the Library website. Try experimenting with OneSearch (also available as a tab on the Library’s home page).

More information about OneSearch will be forthcoming on the Library website and in the Spring 2015 issue of Classified Information.

Bonnie Nelson
