Googling the library.

I’ve been working with RLG for the past couple of months, writing and editing content for their web site and print publications. RLG is a consortium of research libraries, archives, museum libraries, and the like. One of the things I like about this gig is that I get to learn about the high-tech information management tools that lurk in the background of these big libraries.

RLG is trying to take some of the library world’s background technology and bring it to the fore with a new Web application (still in development) dubbed RedLightGreen. If it works, it could do for the library stacks what Google did for the Web.

Here’s an article I just wrote for RLG about the project:

RLG’s RedLightGreen Project: Mining the Catalog

RLG’s 23-year-old Union Catalog encompasses more than 126 million bibliographic records, representing 42 million unique titles. It provides unparalleled coverage across subjects and material types in more than 370 languages, from hundreds of libraries worldwide.

RedLightGreen can use this data to put the most widely held items near the top of any search results list, helping users zero in quickly on the most credible books and authors. If a book appears in dozens of libraries’ collections, it’s a good bet that the book is considered an important source of information in its subject area: its selection by dozens of librarians is an implicit endorsement. By contrast, an item held by only one library may be of interest to Ph.D. candidates and specialists, but is probably less interesting to a general audience.
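For the technically curious, here is a rough sketch of what ranking by holdings could look like. It is only an illustration of the idea, not RLG's actual method: the function name, the (library, title) record format, and the sample titles are all invented for this example, and a real system would presumably combine holdings counts with other relevance signals.

```python
from collections import Counter

# Hypothetical holdings records as (library, title) pairs.
# Both the format and the titles are made up for illustration.
holdings = [
    ("Library A", "Survey of World History"),
    ("Library B", "Survey of World History"),
    ("Library C", "Survey of World History"),
    ("Library A", "Regional Trade Routes"),
    ("Library B", "Regional Trade Routes"),
    ("Library C", "Notes on a Single Manuscript"),
]


def rank_by_holdings(holdings, matching_titles):
    """Order matching titles so the most widely held come first."""
    # set() drops duplicate records, so each library counts once per title.
    counts = Counter(title for _library, title in set(holdings))
    # Sort by descending number of holding libraries; break ties by title.
    return sorted(matching_titles, key=lambda title: (-counts[title], title))


# Pretend these three titles all matched a user's search.
matches = [
    "Notes on a Single Manuscript",
    "Regional Trade Routes",
    "Survey of World History",
]
ranked = rank_by_holdings(holdings, matches)
```

In this toy data the survey held by three libraries lands first and the manuscript notes held by a single library land last, which is the "implicit endorsement" effect in miniature.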

The clever insight of Google founders Sergey Brin and Larry Page is that links from one page to another are implicit endorsements of the linked pages. If you can collect that link data, as Google does, you can infer which pages the Web publishing community finds most valuable, and use that information to improve search results.

Similarly, RedLightGreen will, if it works, take holdings data from library catalogs and use that information to identify the books deemed most valuable by the library community. Pretty cool stuff.