Wednesday, April 2, 2008

Reading Natural Science Papers: A Method I Swear By

Photo by: a trying youth

Those of us who enter grad school in the natural sciences inevitably encounter frustrations with "reading papers." They either seem totally incomprehensible or take forever to get through. When the stars are particularly misaligned, both happen. I've struggled with this for a long time, and through talking with my adviser and others I have learned that there is a general system for getting through papers without wasting an entire day, which I outline here in 3 simple points. This strategy has gone through some rigorous training, so I'm not going to make excuses for it and say "I hope it works for you." I think it's a good, solid method and should work for just about anyone doing research in the natural sciences. At the same time, I realize the process is one of learning and revising, so I'd love to hear your tips on this subject.

A Caveat for the Young

But first, a caveat for those who are just starting to read papers, often those who just started grad school or started research with a group in the late undergrad years. Reading science papers sucks at the beginning, and there's really no way around it but to keep at it. At the beginning you understand barely anything. There's too much jargon, there may be all kinds of equations you don't understand, you're not clear on what is obvious and what is new in the field, you're not clear on what is important and what is a detail, etc. This is normal. Follow these tips anyway, try your best, pay special attention to the "guess the figures" tip, and keep at it. It will start to make sense sooner than you'd expect.

2 Categories of Papers

1. Know exactly why you're reading the paper. In my experience, papers in your subject fall into 2 categories: 1) Papers from which you want to extract certain information. 2) Papers that are extremely relevant that you need to read in detail. You read every word of the second type of paper and not every word of the first type. Not heeding the latter part of that warning is where most of my time has been wasted. Most papers in your field fall into the first category. And by most, I mean almost all. That makes sense, because if there were a massive number of papers so close to what you are doing that you needed to read every word of them, you would be getting scooped right and left (scooped = someone else publishing what you're working on before you do). A good strategy for deciding which category a given paper is in is to assume that it's in category 1 and move on. If, as you read, you realize it's very relevant, you will naturally start reading more and more of it and it will fall into category 2. Only very few papers fall into category 2. When they do, you know, because you are usually scared shitless that you just got scooped or very excited that you just found some new, very relevant information that could really help your project. Category 2 papers are most often ones that you read multiple times.

2. Do the following in order: read the abstract, stop and make a guess as to what the figures will be, look at the figures and captions, read the conclusion, read the introduction if necessary. Barring a few instances where I get bogged down on figures that are interesting for one reason or another, this usually takes no longer than 5 minutes. This is where you get the gist of the paper. The abstract of course should have been read before even downloading the whole paper. The guess-the-figures step is an extremely handy trick that keeps you engaged. You think "If I wrote a paper with this abstract, I would probably need the following figures to support my claims." Once you do this, each figure becomes a "Yup, of course" or "Hmm, I didn't think of that. Let me think why this is important," and you're 10 times more engaged than if you had kept reading mindlessly. The conclusion is often even more concise than the abstract in stating what about the paper is new and important. Finally, the introduction should be read if you feel you want more of a sense of context for the work. If you already know how the work fits into the larger picture, the introduction is not necessary.

If you feel you've found what you wanted from the paper or what you want is definitely not there, stop here.

3. Move on to discussions of the figures within the text. Know which figures are most relevant to you, search for where the authors begin discussing them (searching the PDF for "Fig. 3" or something equivalent works well), and read those sections.
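
If you'd rather script the "Fig. 3" search from step 3 than do it by hand, here is a minimal Python sketch. It assumes you already have the paper's text as a plain string (getting text out of a PDF is not shown), and the function name find_figure_mentions is made up for illustration. It simply returns the paragraphs that mention a given figure number.

```python
import re


def find_figure_mentions(text, figure_number):
    """Return the paragraphs of `text` that mention the given figure.

    Hypothetical helper: assumes paragraphs are separated by blank lines
    and that figures are cited as "Fig. 3", "Figure 3", "Figs. 3", etc.
    """
    # (?!\d) keeps "Fig. 3" from also matching "Fig. 31".
    pattern = re.compile(
        r"\bFig(?:ure)?s?\.?\s*" + str(figure_number) + r"(?!\d)",
        re.IGNORECASE,
    )
    # Split on blank lines and drop empty chunks.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [p for p in paragraphs if pattern.search(p)]


sample = (
    "We measured X.\n\n"
    "As shown in Fig. 3, Y rises.\n\n"
    "Figure 31 shows Z.\n\n"
    "See figure 3(b) for details."
)
for paragraph in find_figure_mentions(sample, 3):
    print(paragraph)
```

This won't catch every citation style (ranges like "Figs. 2-4", for example), so treat it as a starting point rather than a replacement for skimming.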

This is all you need for Category 1 papers. If you need to read more, it's probably a Category 2 paper, and that should be obvious by this point. Note there was no mention of the experimental setup and details. These should be skimmed, if read at all. The only time you need these details is when you are comparing your exact experiment with theirs or are working off of their experimental setup, in which case the paper is clearly a Category 2 and you'll probably be reading the whole paper more than once.

I read Category 2 papers using the same steps above, only after step 3 I start at the introduction and read through the whole thing, skimming parts I just read in detail and understood easily and slowing down at the more interesting sections.

That's it.

What to avoid: Equations, Details, and Reference Chasing

Don't get bogged down in equations. If you really need to know them, you'll get back to their derivations later. The worst thing you could do is start reading the paper from the beginning, word for word, get to equation 1, realize you don't know the derivation, see that it comes from reference 7, look up reference 7, start doing the same thing with that paper, and slip into a cycle of never completing an entire paper because you stop and look up a reference every time you don't understand the tiniest detail. Details are important in science, of course, but there is simply too much out there that you don't know to be chasing down all the details and all the derivations the very first time you run into them in the scientific literature.

Reading More Papers

I've been using this method (category, abstract, guess figures, figures, conclusion) for months now and it's absolutely dandy. I find I'm exposing myself to more papers now because I know I can get the gist of a paper without wasting an hour. I also don't feel intimidated by not understanding every detail. I used to have this false idea that professors and other experienced scientists understand everything about the papers they read. That's nonsense. Of course they don't; that's why papers continue to get published: they're new. I now find myself browsing the Nature and Science websites, expanding my base without feeling stupid about not knowing the details and without wasting too much time. I'm keeping up on the journals in my field a lot more than I used to. And I'm saving massive amounts of time when doing literature searches to find information on a specific idea or subject.

What strategies do you use?


Liz said...

Excellent suggestions! I wish they had been around a few months ago...

Bdizzy said...

why, just graduated?!


Liz said...

No.. I just started a masters/phd in a field sort of loosely related to my undergrad in September. I've been struggling through reading since then (showing a lot of improvement, but it's still slow). Now I'm writing a big lit review (think first chapter of a dissertation), and I have pathetically few sources.

It was a great article though. It'll definitely help to streamline my reading in the future. Recognizing that I don't have to understand every little thing is going to speed me up a lot.