To Those Involved in Research!

A 'copy and paste' from the New York Times.

Defeating Bedlam

This week, I want to look at one of the unglamorous, but essential, parts of science: the problem of how to organize the information you have so that you know what you’ve got. For, like everything else in the digital age, the process of collecting and managing scientific information has been evolving. Fast.

Here’s what I used to do, way back, oh, seven years ago when I was writing a book about the sex lives of animals. When I wanted to do research on a topic, I would go to the university library — how quaint! — and photocopy the scientific papers I wanted to read. Papers such as “Homosexual rape and sexual selection in Acanthocephalan worms” from the journal Science. Or “Deformed sperm are probably not adaptive” from Animal Behaviour. If I was looking for something more obscure — say, “A review of tool use in insects” from Florida Entomologist — I sometimes had to go to a specialist library, like the one in London’s Natural History Museum.

Having collected the papers, I would take them back to my office, type the bibliographic details (authors, title, year published and so on) into my computer and put the photocopies into folders with other papers on the same general topic. In the case of the Acanthocephalan worms, it was a folder labeled “sabotage”; for the deformed sperm, it was “other sperm.” When the time came to write up my discoveries and thoughts on the subject of sperm evolution, or how males sabotage their rivals, I went to the relevant folder, read the papers, made notes on them and started writing.

As a system, it was a little clumsy — photocopying was a bore, and if I wanted to spend a couple of months writing somewhere other than my office, I had to take boxes of papers with me — but it worked. I knew what I had and where it was.

Then the scientific journals went digital. And my system collapsed.

On the good side, instead of hauling dusty volumes off shelves and standing over the photocopier, I sit comfortably in my office, downloading papers from journal Web sites.

On the bad side, this has produced informational bedlam.

The journal articles arrive with file names like 456330a.pdf or sd-article121.pdf. Keeping track of what these are, what I have, where I’ve put them, which other papers are related to them — hopeless. Attempting to replicate my old way of doing things, but on my computer — so, electronic versions of papers in electronic folders — didn’t work, I think because I couldn’t see what the papers actually were.

And so, absurdly, it became easier to re-research a subject each time I wanted to think about it, and to download the papers again. My hard drive has filled up with duplicates; my office, with stalagmites of paper. And it isn’t just that I have the organizational skills of a mosquito. Many of my colleagues have found the same thing. (Yes, we talk about it. Oh, they are lofty, the conversations in university common rooms.) In short, access to information is easier and faster than ever before (for a caveat, see the notes, below), but there’s been no obvious way to manage it once you’ve got it.

Several pieces of software are now being developed to address this problem. I want to look at two of them here. The first is called Zotero; the second, Papers. Both are in version 1 and are still a bit buggy; but each has the potential, I think, to become a valuable tool for research.

Zotero aims to let you build a library of useful books and articles that you encounter while surfing online. It’s an extension of the Web browser Firefox, and as you’d expect, it’s free to download and easy to install.

Once you’ve installed it, each time you visit a Web page that contains items — books, newspaper articles, soundtracks, films, etc. — with bibliographic information, it extracts that information and allows you to save it to your Zotero library if you want to.

So, suppose you’re interested in books about the psychology of war, and you go to Amazon and type “On Killing” into the search box. A list of books appears; Zotero collects the information for all of them and allows you to select the ones you want to keep. These are then put into your Zotero library. Once they’re there, you can make notes on them, put them into folders with other items that are related, and so on. If you ask it to, Zotero will see if it can find a given book in a local lending library. And, supposedly, you can also pull bibliographic information from Zotero into documents you’re writing, but I haven’t tried that part yet.

It’s a powerful piece of software with a lot of capabilities, though not all of them work as well as they could. For instance, it’s hit-or-miss with newspaper articles — sometimes it recognizes them, sometimes it doesn’t — and it can’t interpret information from, alas, my local lending library. It does, however, allow you to screen grab, so you can still collect such information if you want it. The screen grab also allows you to add interesting Web pages to your Zotero library. (This is different from storing the link to a Web site. The screen grab gives you the page as it was when you looked at it; clicking a link gives you a site as it is today.)

A minor quibble: if you use a small laptop, as I do, you may find the Zotero window occupies too much of the screen. But I shall certainly keep using it, though not, perhaps, as its conceivers intended. For me, it’ll be a scrapbook of interesting stuff: books to buy later, press releases on subjects I think I might write about one day, magazine pieces about cities I’m thinking of visiting.

For the bulk of my researches, however, I shall use Papers. This software has been designed for the Macintosh by two avid fans who call themselves Mekentosj; it only works on the Macintosh platform. It’s not free, but it is quite cheap (20 pounds sterling; 40 U.S. dollars) and, for me, it’s been worth the money. For it solves the problem I started out describing — how to keep on top of scientific articles. How to know which ones you have, where they are, and what else you’ve got on the same subject.

The makers describe it as iTunes for .pdf files, and that’s broadly right. (For anyone who’s never encountered these things, a .pdf file is a type of document file that any computer can open using a free downloadable piece of software. This is the form electronic journal articles come in, and it means they look just as they would have done if you were reading the journal the old-fashioned way. iTunes is a piece of music management software.) The idea is that, when you download an article, it goes into your Papers library. The bibliographic information immediately appears; so does, if you’re lucky, the “metadata,” such as the abstract and the list of subjects that the authors thought their article touches on. (I say “if you’re lucky” because this doesn’t always happen automatically.) The document itself gets neatly filed in a folder on your hard drive, and renamed by authors and year. Gone are the days of 456330a.pdf and sd-article121.pdf. Hallelujah.
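For the curious, the renaming idea can be sketched in a few lines of Python. This is only an illustration, not how Papers itself does it: the function name paper_filename, the "Surname, Initials" input format and the "et al" rule for three or more authors are all assumptions made for the example.

```python
import re


def paper_filename(authors, year, ext=".pdf"):
    """Build a readable file name from a paper's authors and year.

    Hypothetical helper: `authors` is assumed to be a list of strings
    in "Surname, Initials" form, as in a typical reference list.
    """
    # Keep only the family name (the text before the first comma).
    surnames = [author.split(",")[0].strip() for author in authors]

    # Assumed convention: name up to two authors, otherwise "First_et_al".
    if len(surnames) > 2:
        label = surnames[0] + "_et_al"
    else:
        label = "_".join(surnames)

    # Replace spaces and punctuation that are awkward in file names.
    label = re.sub(r"[^\w-]+", "_", label).strip("_")
    return f"{label}_{year}{ext}"


print(paper_filename(["Abele, L. G.", "Gilchrist, S."], 1977))
print(paper_filename(["Harcourt, A. H."], 1989))
```

Run on the worm and sperm papers from the notes, this yields names built from the authors and the year instead of something like 456330a.pdf. Actually moving or renaming the file on disk is left out on purpose, since where the files should live is a matter of taste.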

And that’s just the beginning. Not only can you read the papers, annotate them, find them and create folders of papers on related subjects, you can also use the software to search the big scientific databases like PubMed and the Web of Science. (Such databases are where you go to find out what’s already been published on the subject you’re interested in; it’s where most scientists find out about the papers they want to collect.) It doesn’t (yet) replace bibliographic software such as Endnote; but it can be used with it quite neatly.

Papers does have some teething problems. As I said, it’s still buggy, so not everything functions as it should. Moreover, the way it works is not always intuitive, and there’s no formal “help.” Instead, if you have a question, you have to wade through user forums to try to see if anyone else has had the same question before — and, more to the point, whether anyone has answered it. But after a couple of days of experimenting, I got it doing exactly what I need.

Organizing materials is always idiosyncratic. I have one friend who organizes the novels he owns by the year in which the books were published; another goes by the color of the spine. (The first accused the second of having the soul of an interior decorator.) But the important thing is not how you do it, but whether it works — whether you can find what you’re looking for. These bits of software open up possibilities; for some people they will be useful, for others they won’t. Some will use both, others neither. For me, well, a few days after discovering Papers, I put 20 sacks of real paper into the recycling bin. At last, I’m back to knowing what I have and where it is.

Bedlam has been defeated.

**********

NOTES:

One caveat. I say “access to information is easier and faster than ever before.” With respect to scientific information, this is true for people within universities, but not for those without them. One of the consequences of the scientific journals going digital is that it has become harder for members of the public to get access to original scientific information. It used to be the case, for example, that anyone could get permission to spend a day at the library at Imperial College; once there, they could read any of the journals on the library shelves. Now, subscriptions to the paper editions of many journals have been stopped — the journals are no longer physically there — and only members of the university are allowed access to the online versions. Some journals give free access, at least to back-issues; but many do not. Then, if you are not a member of a university and you want to read some articles, they may cost you as much as $30 each. I think this is a pity. Perhaps not many people want to read original scientific research; but somehow, it seems against the spirit of the enterprise.

In case anyone’s interested, here are the full details for the articles I refer to. For the worms, see Abele, L. G. and Gilchrist, S. 1977. “Homosexual rape and sexual selection in Acanthocephalan worms.” Science 197: 81-83. For deformed sperm, see Harcourt, A. H. 1989. “Deformed sperm are probably not adaptive.” Animal Behaviour 37: 863-865. For insects and tools, see Pierce, J. D. J. 1986. “A review of tool use in insects.” Florida Entomologist 69: 95-104.

Many thanks to Austin Burt, Gideon Lichfield and Daniel Simpson for insights, comments and suggestions.
