The news about the RHS Bibliography of British and Irish History has unsurprisingly provoked considerable discussion and criticism. I want to follow up my last post with a few comments.
As some have already pointed out, the basic reason this is happening is that the funding structure for online resources in the UK (I don’t know about anywhere else) does not take into account the resources needed to maintain online material in the long term. Even if you never update or add to your resource after publication, you have to pay for hosting. Servers fall over from time to time and need human intervention to get them restarted. Databases can get mysteriously corrupted and need rescuing (you have to keep backups as well). You have to keep your database secure from the legions of spammers and vandals and their bots (may they rot in hell).
A bibliography, however, does also need to be regularly updated. And that’s only one problem.
Yes, technically you can scrape the RHS bibliography, extract all its data and re-publish it somewhere. (Bill Turkel has already provided instructions; it’s a doddle.) If you do that you’ll be breaking the Terms & Conditions and infringing copyright. You can try it if you want, but the new owners aren’t going to like it, and they’ll have more money than you for lawsuits. Do you want to take them on?
And I’m going to say this flat-out, without equivocation: there is no way that you could build an equivalent source from scratch using Web2.0 methods. I’m extremely doubtful that you could even keep it properly updated that way. Because we’re running right up against the limitations and weaknesses of Web2.0 and crowdsourcing here.
A major part of the value of the RHS bibliography is that it aims, however imperfectly, (a) to be comprehensive and (b) to use structured, systematic classifications. It’s not just a keyword search.
Now, my own recent experience with wikis is that people are pretty good at providing content but largely terrible at doing structure and order. And those are vital for an online bibliography.
Bibliographies are very complicated structurally. (This is why there aren’t that many web bibliography applications out there…) There are so many different types of publication you have to take into account: even the most basic – books (authored and edited), journal articles and book chapters – necessitate a pretty complex database structure. Take a look at the array of BibTeX entry types.
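To make the structural point concrete, here is a minimal sketch in Python of just the three basic publication types mentioned above. It is an illustration only, not how the RHS bibliography or any real tool is built: the class names, fields, the `cite` helper and the sample records are all invented for the example. Even this toy version needs a separate record type per kind of publication, a distinction between authors and editors, and a chapter that points at the book containing it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:
    family: str
    given: str


@dataclass
class Book:
    # Roughly the territory of BibTeX's @book entry type.
    title: str
    year: int
    publisher: str
    authors: List[Person] = field(default_factory=list)  # empty for edited volumes
    editors: List[Person] = field(default_factory=list)  # empty for authored books


@dataclass
class JournalArticle:
    # Roughly BibTeX's @article: no publisher, but journal, volume and pages.
    title: str
    year: int
    authors: List[Person]
    journal: str
    volume: str
    pages: str


@dataclass
class BookChapter:
    # Roughly BibTeX's @incollection: the chapter has its own authors
    # and refers to the book it appears in.
    title: str
    authors: List[Person]
    book: Book
    pages: str


def names(people: List[Person]) -> str:
    """Join people as 'Given Family and Given Family'."""
    return " and ".join(f"{p.given} {p.family}" for p in people)


def cite(entry) -> str:
    """Format one entry; every type needs its own rule (hypothetical style)."""
    if isinstance(entry, Book):
        who = names(entry.authors) if entry.authors else names(entry.editors) + " (ed.)"
        return f"{who}, {entry.title} ({entry.publisher}, {entry.year})"
    if isinstance(entry, JournalArticle):
        return (f"{names(entry.authors)}, '{entry.title}', "
                f"{entry.journal} {entry.volume} ({entry.year}), {entry.pages}")
    if isinstance(entry, BookChapter):
        return f"{names(entry.authors)}, '{entry.title}', in {cite(entry.book)}, {entry.pages}"
    raise TypeError(f"unsupported entry type: {type(entry).__name__}")


# Invented sample records, purely for illustration.
book = Book(title="An Invented Collection", year=2000,
            publisher="Example Press", editors=[Person("Editor", "Ann")])
chapter = BookChapter(title="A Made-up Chapter",
                      authors=[Person("Writer", "Bob")], book=book, pages="1-20")
print(cite(chapter))
```

And that is before subject classifications, series, reprints, theses, reviews or multi-volume works come into it, each of which would add more types or more fields.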
(I’ve created online bibliographies using specialised bibliography tools and customised mediawiki plugins. It’s not easy. Actually, it’s time-consuming and bloody hard work. I enjoy it, but I’m weird that way.)
Web2.0, crowdsourcing, folksonomic tagging, can do a lot of things. But it’s all kind of haphazard and serendipitous. Dan Cohen and Roy Rosenzweig warned us about this in the context of collecting primary sources online, but it also applies here:
Collections created on the web through the submissions of scattered (and occasionally anonymous) contributors do have a very different character from traditional archives, for which provenance and selection criteria assume a greater role. Online collections tend to be less organized and more capricious in what they cover.
A capricious, disorganised bibliography is not very useful to scholars.
* * *
Well, that’s the pessimistic post. I’ll try to do a slightly more constructive practical one later with some ideas and resources…