Every Friday afternoon, the students in our Ph.D. program are given the opportunity to present their research to the other students in the program. It’s a great chance to take stock of what you’ve been doing, and to let your most avid fans critique your work in a comfortable, informal setting (before, say, presenting the work to the foremost experts in the field at a conference).
Now it doesn’t always work out in such an idealized fashion, but that’s another story. Rather, an interesting point came up at last Friday’s presentation: does an incredibly useful tool, built to help researchers do their work more effectively, count by itself as research? Or is it “merely” software development?
Here’s the scene: one of the students – arguably one of the best, as he was the only second-year to propose and successfully defend a thesis idea – presented his ongoing work on a network evolution visualization tool that runs entirely on Flash/ActionScript. It’s a brilliant piece of work (check out the examples; you don’t even need to know what’s going on – I certainly haven’t a clue! – to appreciate how smoothly everything functions). After he demo’d the software and explained how it could be used to model changing biological networks (gene regulation, protein-protein interactions, and so on), he got the following question:
“So, this is a great software project…but where’s the research?”
It was a pointed question (as I mentioned before, these gatherings aren’t always exclusively flowers and unicorns), but it’s a valid one, and it raises another: what’s the difference between core research and tangential tool improvements? Where is the dividing line between the two?
The first-years and I discussed this issue for a little while after the presentation ended. My favorite example, which I mentioned to them, is the quintessential computational biology algorithm known as BLAST. BLAST is a paradigm-busting search engine that allows biological sequence data – proteins (amino acids) and DNA/RNA (nucleic acids) – to be searched in seconds rather than days or weeks. It’s almost perfectly analogous to web searching: it would be very easy to write a search algorithm that crawls the entire Internet every time somebody types a search phrase, with the downside being that you’d “only” have to wait a few weeks for the results to come back.
Such was the issue facing biologists before BLAST came along: searching, say, the human genome (over 3 billion nucleotides) for a specific sequence took forever. Or, better yet, searching the mouse genome for a sequence that was similar but not identical to a sequence in the human genome would take even longer than forever. It was the proverbial bottleneck in biological research.
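To make the web-search analogy a bit more concrete, here is a toy Python sketch of the general idea: scanning the whole sequence for every query versus building a lookup table once and reusing it. To be clear, this is not how BLAST itself is implemented, and it only handles exact matches; the function names, the tiny "genome," and the word length `k` are all made up for illustration.

```python
from collections import defaultdict


def naive_search(genome, query):
    """Scan the entire genome for each query (the 'crawl everything' approach)."""
    hits = []
    start = genome.find(query)
    while start != -1:
        hits.append(start)
        start = genome.find(query, start + 1)
    return hits


def build_kmer_index(genome, k):
    """One-time pass: record every position at which each length-k word occurs."""
    index = defaultdict(list)
    for i in range(len(genome) - k + 1):
        index[genome[i:i + k]].append(i)
    return index


def indexed_search(genome, index, query, k):
    """Look up the query's first k letters, then verify only those candidates."""
    if len(query) < k:
        # Query is shorter than the indexed word length: fall back to scanning.
        return naive_search(genome, query)
    candidates = index.get(query[:k], [])
    return [pos for pos in candidates if genome.startswith(query, pos)]


# Hypothetical toy data, purely for illustration.
genome = "ACGTACGTTAGCACGT"
k = 3
index = build_kmer_index(genome, k)

print(naive_search(genome, "ACGT"))
print(indexed_search(genome, index, "ACGT", k))
```

Both calls find the same positions, but the indexed version only touches the handful of places where the first three letters already match. Finding sequences that are similar but not identical takes considerably more machinery (scoring and extending partial matches), which this sketch leaves out entirely.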
BLAST is now the most-cited computational biology paper ever. Period. It’s been cited by over 33,000 other papers, including those describing other seminal resources like the Protein Data Bank, GenBank, and many, many others. It set off an explosion of computational research by reducing the time needed for massive genomic and proteomic comparisons from weeks to mere seconds. It was huge.
And yet, at its very essence, it’s still just a tool. It didn’t add anything to the current understanding of biology; rather, it aided others in that pursuit. Such is the nature of the TVNViewer from last Friday. It doesn’t contribute to the standing pool of biological knowledge, but rather greases the wheels for future research.
So the question that remains is this: does something that facilitates future breakthroughs itself qualify as a breakthrough? Does meta-research count as research? Google has certainly done quite well for itself by allowing others to find relevant information in an efficient fashion. Entire companies are devoted to making tools for programmers, from versioning applications to IDE plugins. Can we really say that such important tools aren’t themselves a form of research?
One could argue that, in a biological field, software development is somewhat beside the point. But since we’re in computational biology, I think that objection becomes moot. There is indeed a spectrum – many of my classmates spend a great deal of their time in “wet” labs, whereas others have never run a single experiment that required a lab coat – but when push comes to shove, one can hardly argue against the utility of a mere “tool” that singlehandedly put computational biology on the map.
Anyway. Food for thought on this dreary Monday.