Tuesday, July 10, 2007

Tips (rules?) for Open Notebook Science

I recently decided to give open notebook science a try. For my lab notebook to be useful to others, I've got to put a little extra effort into making it understandable to outsiders. I think a lab notebook will never be, and perhaps should never be, as easy to understand as a paper, since you want to spend most of your time doing science rather than making beautiful figures and writing stunning introductions. I would simply like to reach the point where someone in a field similar to mine could pick up my notebook and understand it without too much effort.

I'm trying to catalog some basic ideas that would promote better open notebooks, with better defined as:
  1. searchable
  2. understandable
  3. dependable (i.e. small software failures won't forever zap all of your results)

Here's what I've come up with so far. Please comment if you think of other tips.

  • use some sort of version control system (wiki, cvs, subversion)
    • this is particularly important if you have an electronic only lab notebook as it creates a time stamp for everything you enter into the notebook, which would be important for patents and other legal stuff
    • it also allows you to go back and look at previous versions
  • back up your notebook
    • with cvs or subversion, back up your repository
    • with wikis, this becomes wiki-specific, so check the documentation for your wiki
  • organize hierarchically
    • break the notebook into sections
    • break the sections into subsections
    • remember to include a time stamp in the text of your notebook at the beginning of each new experiment you do and at the beginning of each section you start
  • introduce every section giving the bigger picture (not too long, just a paragraph or so on the big idea); a nice figure would be useful too since many scientists prefer skimming figures to skimming text
  • if a section is complete or dead (i.e. you've abandoned the project), state so very prominently at the start of the section. If the work was published, provide a reference. If the work was abandoned, perhaps explain why.
    • also if a section hasn't been touched for a long while, you might add something like "This chapter is not being actively worked on"
  • link to raw data when and where you mention it in your notebook
  • remember the notebook is public, so be careful not to say stuff that might offend sensitive ears or sensitive scientists
  • include high-quality images in your documents; things like agarose gels will need to be zoomed in a lot to be inspected in detail; if you convert your full-resolution tiff to a low-quality jpeg, it'll just look like pixelated blah. Then again, you can't always use full-size images, particularly from a high-megapixel camera, because the notebook will quickly become giant; so here is my suggestion:
    • if the image is small (<1mb), include the full-size image
    • if it is huge but detail doesn't matter, include a decent resolution image that can be zoomed in 2-4x and still look nice
    • if it is huge and detail matters, include a decent resolution image, but also include a link to the full size image like you would for other raw data
  • construct the document in such a way that it is easily indexed by search engines (otherwise no one will find your results; people probably won't read your lab notebook for fun)
    • this is difficult to comply with if you use pdfs because Google currently only indexes the first few hundred kbytes of a pdf; my lab manual is 30MB
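
The image rule of thumb in the list above can be written down as a tiny decision function. This is only a sketch of my suggestion, not a tool I actually use: the names `plan_image` and `ImagePlan` are made up for illustration, "<1mb" is read as one million bytes, and treating a small image as "just include it at full size" is an assumption.

```python
from enum import Enum

# Assumption: "<1mb" is interpreted as fewer than 1,000,000 bytes.
SMALL_IMAGE_BYTES = 1_000_000


class ImagePlan(Enum):
    """Hypothetical labels for the three cases in the list above."""
    EMBED_FULL = "embed the full-size image"
    EMBED_REDUCED = "embed a reduced image that still looks nice zoomed 2-4x"
    EMBED_REDUCED_AND_LINK = "embed a reduced image and link to the full-size file"


def plan_image(size_bytes: int, detail_matters: bool) -> ImagePlan:
    """Pick how to include an image, given its file size and whether detail matters."""
    if size_bytes < SMALL_IMAGE_BYTES:
        # Small image: assumed cheap enough to include as-is.
        return ImagePlan.EMBED_FULL
    if detail_matters:
        # Huge image where detail matters (e.g. an agarose gel):
        # show a reduced copy and link the original like other raw data.
        return ImagePlan.EMBED_REDUCED_AND_LINK
    # Huge image where detail does not matter.
    return ImagePlan.EMBED_REDUCED


if __name__ == "__main__":
    print(plan_image(30_000_000, detail_matters=True).value)
```

The function only makes the decision; the actual resizing and linking would depend on whatever image tools and hosting you use.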

Please let me know if you have any ideas or suggestions about these rules.


Jean-Claude Bradley said...

I definitely understand the concern about making your notebook understandable to others. This is what we have evolved after 2 years of being open:

1) Use a wiki as the actual notebook with one experiment per page. Wikispaces has several advantages - free and hosted, RSS feeds, zip backups, automatic share-alike with attribution CC license, third party time stamps on all versions of the page. Link to raw data here.

2) Use a blog to discuss milestones and key problems to a more general scientific audience and link back to the experiments in the wiki. You can also link from the experiment pages to a blog post that summarizes the purpose of the research. That way you don't have to constantly worry about your audience in the notebook. You can also use wiki pages to organize experiments but using the blog has the advantage of updating your RSS subscribers easily.

3) Use a mailing list for external collaborations. Unless you work face-to-face with people on a regular basis, it may be difficult to use a wiki. Also, there is a bit of a learning curve with wikis and people usually feel more comfortable with email.

Now organic chemistry may be a little different from your area so I'll be interested to see what you find to be useful.

I've also described how this all evolved in a recent talk at the American Chemical Society meeting.

Pedro Beltrão said...

It is really cool that you are doing this. There are some pages in the Nodalpoint wiki that were recently started exactly to note down and discuss some of these issues. Feel free to participate there with these ideas.


Anonymous said...

What software do you write your notebook in? Are you using a plain text editor to write the LaTeX, or a WYSIWYG editor that exports to LaTeX and PDF?
It would be handy to know, as I'm thinking of using something similar to keep my notes (papers tend to get lost sometimes, and electronic is easier to share).

J said...

I discuss a little bit how I make the notebook on the notebook's website.

I also provide a link to an example on that page.

I use TeXShop on a Mac to compile the document, and I edit the document in the vim text editor.

ChemSpiderMan said...

I have been collaborating with Jean-Claude Bradley on providing access to molecules in ChemSpider and linking back to the wiki and blogs (this is now being made available through online structure deposition).

We have recently enabled image generation with embedded InChI strings and keys. http://www.chemspider.com/blog/?p=161

Is there anything I can do to help in your efforts to expand Open Notebook Science? Specifically in regards to hosting your molecules and analytical data. http://www.chemspider.com/docs/Uploading_Spectra_onto_ChemSpider.htm

Best wishes.

J said...

Thanks for the links, ChemSpiderMan. They aren't coming out right in your comment, so first let me add them:

image generation with embedded InChI strings and keys

uploading spectra into chemspider

However, I don't do any chemistry, so I have no molecules that I need hosted. I like the image information embedding trick. Does that get picked up by search engines?