Wednesday, March 19, 2008

The Biosciences in Google's Summer of Code

The Google Summer of Code project participants have been selected. I scanned the list to see how projects specifically aimed at the biosciences and bioinformatics fared:

  • GenMAPP (Gene Map Annotator and Pathway Profiler), a tool for visualizing gene expression data on top of graphical representations of biological pathways.
  • The NESCent (National Evolutionary Synthesis Center) Phyloinformatics project, which has a range of potential projects to do with phylogenetic analysis, covering things like phyloXML integration with BioPerl and BioRuby, phyloinformatics web services, and tree analysis using the MapReduce programming model (with Hadoop).
  • OMII-UK, which covers a range of tools including the Taverna Workbench for workflow design and execution.
  • Also participating is OpenMRS, a medical record system aimed at developing countries.
There are also at least two platforms for cluster, parallel or grid computing on the list; I spotted the Globus Toolkit and OAR, but there are probably a few more in that broad category (eg, OMII-UK oversees a bunch of Grid-related projects too).

It's worth noting that I've ignored a bunch of really important pieces of software that are less field-specific, but are lower-level components of the platforms critical for most large bioinformatics projects. Things like Python, Perl, R, various Open Source databases, and collaboration tools like wikis (eg MoinMoin) and CMSs (eg Drupal) are also participating.

I don't think coding for bioinformatics applications is as attractive to students as working on some of the other "sexier" projects available (eg the Second Life client, or the Apache web server), but kudos to Google for letting a few bioinformatics tools into the fray. Hopefully the students who hack on them learn something and hone their coding skills (you never know, they may even help improve these tools too :) ).

1 comment:

Ani said...

Nice to see Google promoting such projects. Given Google's expertise, AI- and NLP-based biological text mining (a hell of a lot of free text there!) would have been real cool as well.