26 June 2007

Interviews in SciView

In a recent post on NodalPoint, Paulo Nuin issued a call for participation for his interviews on SciView. Below are the questions I would suggest. I will try to submit this interview as soon as I have time for it...


  • Profile

    • Anything that can be used to fill in a FOAF profile at the end: name, homepage, dc:subject, web accounts, image URL, geo-location, etc.

  • Questions copied from Nature-Lifelines

    • What was your first experience as a child ?

    • Whose graduate student would you most like to have been (historical impossibility notwithstanding) ?

    • What single scientific paper or talk changed your career path ?

    • What book has been most influential in your scientific career ?

    • What gives you the most job satisfaction now ?

    • What are your major frustrations ?

    • What is your favourite conference destination and why ?

    • What was the worst/most memorable comment you ever received from a referee ?

    • What book is currently on your bedside table ?

    • The internet is the bane of scientists' lives because... ?

    • What do you do to relax ?

    • Please, tell us more about the theory, practice and etiquette of knitting during lab meetings. How many stitches/garments have you racked up during meetings in your career as a scientist ?

    • What would you have become, if not a scientist ?

    • What single discovery, invention or innovation would most improve your life ?

    • Do you have a burning ambition to do or learn something of no practical or immediate value ?

    • What is the most interesting thing in your fridge ?

    • Why is physics so hard ?

    • What music would you have played at your funeral ?

    • What's just around the corner ?

    • What would you have written on your gravestone ?

  • The questions that Bernard Pivot always asked his guests in the great French television series, Bouillon de Culture. Some of the finest intellectual minds in the world have responded to them.

    • What is your favorite word ?

    • What is your least favorite word ?

    • What turns you on, excites, or inspires you creatively, spiritually, or emotionally?

    • What turns you off ?

    • What sound or noise do you love ?

    • What sound or noise do you hate ?

    • What is your favorite curse word ?

    • What profession would you absolutely not like to participate in ?

    • If heaven exists, what would you like to hear God say when you arrive at the Pearly Gates ?

  • Questions from Scifoo

    • 5 words or phrases that describe your interests and expertise

    • 5 people we should interview next time (not necessarily bioinformaticians)

21 June 2007

Biomoby & Taverna



AFAIK: BioMoby is an ontology-based (?) web-service system for bioinformatics, in which a central repository holds the addresses of web services for computational biology. Taverna is a kind of visual pipeline tool for performing complex bioinformatics tasks.

A few months ago, Egon introduced me to Taverna, and now the latest issue of BMC Bioinformatics contains a paper titled "Seahawk: moving beyond HTML in Web-based bioinformatics analysis": the paper contains many references to the BioMoby project and to Taverna. Seahawk is a Java applet that allows a naïve user to:


  • Load text, HTML, Rich Text, or MOBY XML files from their local disk or a Web site
  • Select all or part of the data displayed (by highlighting or using hyperlinks)
  • Discover then execute services for the selected data
The interface consists of tabbed pane of documents, with MOBY analysis options displayed via popup menus.

BioMoby is definitely something I need to learn: is there anybody around the Evry Genopole or the UVSQ who could introduce me to these technologies ? Thanks a lot.

14 June 2007

Bioinformatics as a Recreation

How can I find a laboratory looking for a small but useful piece of software that I could write after work instead of wasting my time watching TV ? I promise, I just want my name in the 876th position of the author list :-)

Papiers Aléatoires

I'm starting a new blog called "Papiers Aléatoires" (Random Papers). It will contain French translations of abstracts, in order to motivate myself to read more papers and to understand them better. This is my weakness: I don't know how to spot a good article. As an example, I remember reading, years ago, the abstract of THIS paper introducing siRNA without being interested. That said, I am not certain that I will contribute to this blog on a regular basis.

Contributions are also welcome: I'll post any French translation of any scholarly abstract.
In the future, I may also store the translations in a public (RDF?) file.

Pierre

What should I present at SciFoo 07 ?

My beloved boss told me: "I'm OK with paying for your trip to San Francisco if you give a presentation at your conference".
This would be my most expensive talk :-) . But what subject should I present ? Some suggestions have already been posted on the wiki, but I wonder what I could offer that would interest all those talented campers. Any idea(s) ?

12 June 2007

EyeOS: the Web Desktop

Bye bye Microsoft: here is EyeOS, a completely free (Open Source) Web Operating System that runs in your browser, that anybody can collaborate on and extend, and where the files and software are stored on a remote server. I'm not sure that this is good news for system administrators.



You can try it at http://demo.eyeos.org/

11 June 2007

Nature Scintilla


Just like Deepak, I've received an invitation from Euan Adie (thanks Euan) to join the new service from Nature: http://scintilla.nature.com/.

Scintilla collects data from hundreds of news outlets, scientific blogs, journals and databases and then makes it easy for you to organize, share and discover exactly the type of information that you're interested in.

For example, you can keep track of life science podcasts, or the latest papers on schizophrenia, DNA methylation or immunology. Interested in physics blogs? Scintilla can help.

Euan is already the author of www.postgenomic.com, and at first glance the two tools seem to serve the same purpose. This also reminds me of Aggademia, a tool created and tested by Alf Eaton a year ago.

I have only had a quick look at this tool, but I already found it interesting to add a PubMed query to my collection of sources. The service is distinct from Connotea and network.nature.com, but with each of those three tools you can create a group and send invitations (people around me are annoyed by all my invitations), and I hope all of this will be merged in the future.

Shall I use this tool ? I don't know. I already use google-reader, technorati, etc. to handle my resources; just tell me why I should change.

Science Magazine ? Science Magazine ? Where are you ?


Pierre

10 June 2007

Mapping NCBI/PUBMED

In my previous post I showed how I used the <Affiliation> tag from the XML/PubMed records to extract the e-mail addresses and the names of the authors of a paper. I've slightly changed the source code of this program to find the country of origin of each paper. To retrieve the country I used:
1) the suffix of the e-mail address (if any)
2) the name of the country (if any)
3) the name of the city (a few famous ones, such as Stanford, for the US or UK)

My program takes a PubMed query as input and outputs the number of papers per year and per country. I put a few results on ManyEyes. As an example, for the query "Rotavirus" with 1000 records, I was able to assign a country to 887 of them.
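The three rules listed above can be sketched in plain Java as follows. This is only an illustration of the idea, not the program used for the ManyEyes charts: the class name `CountryGuesser`, the regular expression and the tiny lookup tables are all assumptions made for the example, and a real run would need much larger lists.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: guesses a country from a PubMed <Affiliation> string. */
final class CountryGuesser {
    // Tiny illustrative samples (assumptions), not the lists used by the original program.
    private static final Map<String, String> MAIL_SUFFIX = new LinkedHashMap<>();
    private static final Map<String, String> COUNTRY_NAME = new LinkedHashMap<>();
    private static final Map<String, String> CITY = new LinkedHashMap<>();
    static {
        MAIL_SUFFIX.put("fr", "France");
        MAIL_SUFFIX.put("uk", "United Kingdom");
        MAIL_SUFFIX.put("de", "Germany");
        COUNTRY_NAME.put("france", "France");
        COUNTRY_NAME.put("germany", "Germany");
        COUNTRY_NAME.put("usa", "United States");
        COUNTRY_NAME.put("united kingdom", "United Kingdom");
        CITY.put("stanford", "United States");
        CITY.put("oxford", "United Kingdom");
    }

    // Captures the last label of the domain of the first e-mail address found.
    private static final Pattern MAIL =
            Pattern.compile("[\\w.+-]+@[\\w.-]*\\.([a-z]{2,})\\b");

    /** Returns a country name, or null when none of the three rules applies. */
    public static String guess(String affiliation) {
        String lower = affiliation.toLowerCase(Locale.ROOT);
        // Rule 1: suffix of the e-mail address.
        Matcher m = MAIL.matcher(lower);
        if (m.find()) {
            String country = MAIL_SUFFIX.get(m.group(1));
            if (country != null) return country;
        }
        // Rule 2: name of the country.
        String byName = lookup(lower, COUNTRY_NAME);
        if (byName != null) return byName;
        // Rule 3: name of a well-known city.
        return lookup(lower, CITY);
    }

    // Returns the value of the first key that occurs in the text as a whole word.
    private static String lookup(String text, Map<String, String> table) {
        for (Map.Entry<String, String> e : table.entrySet()) {
            Pattern word = Pattern.compile("\\b" + Pattern.quote(e.getKey()) + "\\b");
            if (word.matcher(text).find()) return e.getValue();
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(guess("Dept. of Something, Stanford University, CA"));
    }
}
```

Generic suffixes such as .com or .org say nothing about the country, which is why the sketch falls through to the country and city names when the suffix is not in the table.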






Publications in "Bioinformatics", "BMC Bioinformatics", "Plos Comp. Biol."







Publications about "Rotavirus"







publications about malaria, anopheles, plasmodium etc...

I will not SPAM Nature Network with NCBI/Pubmed.


Protocols & History

In "Standards for biological protocols", Dror starts to discuss what he thinks protocol standards should look like and begins building examples of well-known procedures. I suggested that he look at RDFa or hGRDDL. However, I've never seen a file format that describes a protocol and can be interpreted by a computer.

Hal9000: Hello Dave, how many mini-preps do you want to process with the protocol urn:protocol:miniprep:AU87687 today ?
David Bowman: Hello Hal, I wish I could process 96 mini-preps
Hal9000: Dave, you'll need 192 tubes, 400 ml NaOH. The extraction will last 2H00 and will cost 10$. If you start now you will be able to have lunch before the canteen closes.
David Bowman: Thank you Hal, I'll use the centrifuge n°2
Hal9000: Dave, I detect a problem with the centrifuge n°2. It should be repaired.
David Bowman: What ??? But we already changed it last week !!


Two blogs about digital history drew a parallel between this subject and bioinformatics: here and here. Handling historical data should be as much fun as playing with biological data.

When that happens, the only way that historians are going to be able to grapple with this ocean of content will be something that we might call computational history–Bill likens this concept to bioinformatics.


I've already played with the RDF statements from DBPedia to generate an interactive time line about the History of Sciences but I guess there are many other things to do :-)

05 June 2007

Translating DNA to protein with the Google Web Toolkit: My notebook


Google Web Toolkit (GWT) is an open source Java software development framework that makes writing AJAX applications like Google Maps and Gmail easy for developers who don't speak browser quirks as a second language.

Since the Google Developer Day 2007 I've been playing with the Google Web Toolkit, so here is my notebook. In this example I'll show how I used the GWT to create a classic program that translates a DNA sequence into a protein sequence, with an option to choose between several genetic codes.

First download the GWT:

pierre@linux:~> cd tmp/GWT/toolkit/
pierre@linux:~/tmp/GWT/toolkit> wget "http://google-web-toolkit.googlecode.com/files/gwt-linux-1.3.3.tar.gz"
pierre@linux:~/tmp/GWT/toolkit> tar xfz gwt-linux-1.3.3.tar.gz


The GWT comes with an executable called 'projectCreator' that generates a default project for Eclipse. To make it work I also had to declare the variable LD_LIBRARY_PATH:

export LD_LIBRARY_PATH=/home/pierre/tmp/GWT/toolkit/gwt-linux-1.3.3/mozilla-1.7.12

pierre@linux:~/tmp/GWT> mkdir test01
pierre@linux:~/tmp/GWT> cd test01
pierre@linux:~/tmp/GWT/test01> ../toolkit/gwt-linux-1.3.3/projectCreator -eclipse Test01
Created directory /home/pierre/tmp/GWT/test01/test
Created file /home/pierre/tmp/GWT/test01/.project
Created file /home/pierre/tmp/GWT/test01/.classpath


Another executable creates the default files.
pierre@linux:~/tmp/GWT/test01> ../toolkit/gwt-linux-1.3.3/applicationCreator -eclipse Test01 org.lindenb.gwt.client.Main
Created directory /home/pierre/tmp/GWT/test01/src
Created directory /home/pierre/tmp/GWT/test01/src/org/lindenb/gwt
Created directory /home/pierre/tmp/GWT/test01/src/org/lindenb/gwt/client
Created directory /home/pierre/tmp/GWT/test01/src/org/lindenb/gwt/public
Created file /home/pierre/tmp/GWT/test01/src/org/lindenb/gwt/Main.gwt.xml
Created file /home/pierre/tmp/GWT/test01/src/org/lindenb/gwt/public/Main.html
Created file /home/pierre/tmp/GWT/test01/src/org/lindenb/gwt/client/Main.java
Created file /home/pierre/tmp/GWT/test01/Main.launch
Created file /home/pierre/tmp/GWT/test01/Main-shell
Created file /home/pierre/tmp/GWT/test01/Main-compile


To open your project in Eclipse, launch Eclipse and click the File -> Import menu. Choose "Existing Projects into Workspace" in the first screen of the wizard, and enter the directory in which you generated the .project file in the next screen of the wizard.

The file "Main.java" is located in the package org.lindenb.gwt.client
The source is available at http://lindenb.integragen.org/gwt/Main.java


First we created an abstract class describing a genetic code:

static abstract private class GeneticCode
{
public abstract String getName();
public abstract char translate(char a,char b,char c);
}



Then we wrote the standard genetic code by extending the previous class

static private class UniversalGeneticCode extends GeneticCode
{
public String getName()
{
return "Universal Genetic Code";
}
public char translate(char c1,char c2,char c3)
{
//trivial ......
}
}
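The body elided above can be written in several ways; the full listing at the end of this post uses nested switch statements. As an alternative, here is a minimal table-driven sketch in plain Java. The class name `CodonTable` is an assumption made for the example; the 64-letter string is the standard code laid out with the bases in the order t, c, a, g, and, like the code in this post, the method expects lower-case DNA and returns '?' for anything it does not recognise.

```java
/** Hypothetical table-driven version of the standard genetic code. */
final class CodonTable {
    // Base order used to index the table: t, c, a, g.
    private static final String BASES = "tcag";

    // Amino acids for ttt, ttc, tta, ttg, tct, ... ggg (64 codons, '*' = stop).
    private static final String AMINO_ACIDS =
            "FFLLSSSSYY**CC*W"    // first base t
          + "LLLLPPPPHHQQRRRR"    // first base c
          + "IIIMTTTTNNKKSSRR"    // first base a
          + "VVVVAAAADDEEGGGG";   // first base g

    /** Translates one lower-case codon; returns '?' for any unknown base. */
    public static char translate(char c1, char c2, char c3) {
        int i1 = BASES.indexOf(c1);
        int i2 = BASES.indexOf(c2);
        int i3 = BASES.indexOf(c3);
        if (i1 < 0 || i2 < 0 || i3 < 0) return '?';
        return AMINO_ACIDS.charAt(16 * i1 + 4 * i2 + i3);
    }

    public static void main(String[] args) {
        System.out.println(translate('a', 't', 'g')); // prints M
    }
}
```

The table keeps the whole code visible in four lines, at the price of having to trust the base order; the nested switches are longer but spell out every codon.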


and I wrote a new class for the Mitochondrial Code


static private class MitochondrialGeneticCode extends UniversalGeneticCode
{
public String getName() {
return "Mitochondrial";
}

public char translate(char c1, char c2, char c3)
{
//trivial too...
}
}
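Because the mitochondrial code differs from the standard one in only a few codons, the override can also be expressed as a small map of exceptions layered over any base code. The sketch below is an assumption-laden illustration, not the code of this post: `OverrideCode`, its nested `Translator` interface and the method names are invented for the example, and it uses lambdas, so it is meant for plain Java 8 or later and has not been checked against the GWT 1.3.3 compiler. The exceptions are the ones listed in the comment of the full source at the end of this post.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: a genetic code defined as exceptions over a base code. */
final class OverrideCode {
    /** Stand-in for the translate() method of the GeneticCode class. */
    public interface Translator {
        char translate(char c1, char c2, char c3);
    }

    /** Wraps a base code: codons found in the map win, all others are delegated. */
    public static Translator withOverrides(Translator base, Map<String, Character> overrides) {
        return (c1, c2, c3) -> {
            Character aa = overrides.get("" + c1 + c2 + c3);
            return aa != null ? aa : base.translate(c1, c2, c3);
        };
    }

    /** Exceptions taken from the comment in the full listing (lower-case DNA codons). */
    public static Map<String, Character> mitochondrialDifferences() {
        Map<String, Character> m = new HashMap<>();
        m.put("ata", 'M');                       // AUA: Met instead of Ile
        for (char c3 : "tcag".toCharArray()) {
            m.put("ct" + c3, 'T');               // CUN: Thr instead of Leu
        }
        m.put("tga", 'W');                       // UGA: Trp instead of stop
        m.put("cga", '?');                       // CGA: absent
        m.put("cgc", '?');                       // CGC: absent
        return m;
    }

    public static void main(String[] args) {
        // Dummy base code that answers 'X' for everything, just to show the layering.
        Translator base = (a, b, c) -> 'X';
        Translator mito = withOverrides(base, mitochondrialDifferences());
        System.out.println(mito.translate('t', 'g', 'a'));
    }
}
```

Subclassing, as done in this post, keeps everything inside the GeneticCode hierarchy; the map makes the list of differences readable at a glance and easy to compare with a published table.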



Those codes will be stored in an array.

private GeneticCode geneticCodes[]=new GeneticCode[]{
new UniversalGeneticCode(),
new MitochondrialGeneticCode()

};


Programming the GWT is just as easy as programming with AWT or Swing. For this project we need a ListBox to choose the genetic code and two TextAreas: one for the DNA and the other for the protein.

private TextArea translated;
private TextArea userInput;
private ListBox choiceCode;


The function translateInput is the workhorse of our class. It gets the index selected in the genetic-code list, reads the content of the TextArea containing the DNA, strips whitespace, converts the sequence to lower case and replaces 'u' with 't', translates it codon by codon with the chosen genetic code, and puts the result in the protein TextArea.


private void translateInput()
{
    int idx= this.choiceCode.getSelectedIndex();
    if(idx==-1) return;
    GeneticCode code= this.geneticCodes[idx];
    String dna= this.userInput.getText().replaceAll("[ \t\n\r]","").toLowerCase().replace('u', 't');
    StringBuffer protein= new StringBuffer(dna.length()/3+1);
    for(int i=0;i+2< dna.length();i+=3)
    {
        protein.append(code.translate(dna.charAt(i), dna.charAt(i+1), dna.charAt(i+2)));
        if(protein.length()%40==0) protein.append("\n");
    }
    translated.setText(protein.toString());
}
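The same steps can be tried outside the browser. The following is a minimal plain-Java sketch of the clean-up and the codon loop, with no GWT widgets involved; `FrameTranslator`, its nested `Translator` interface and the `lineWidth` parameter are assumptions made for the example, not part of the original class.

```java
import java.util.Locale;

/** Hypothetical stand-alone version of the translateInput() logic. */
final class FrameTranslator {
    /** Stand-in for the translate() method of the GeneticCode class. */
    public interface Translator {
        char translate(char c1, char c2, char c3);
    }

    /** Removes all whitespace, lower-cases the sequence and turns RNA 'u' into 't'. */
    public static String normalize(String raw) {
        return raw.replaceAll("\\s+", "").toLowerCase(Locale.ROOT).replace('u', 't');
    }

    /** Translates frame 1 of the sequence and inserts a newline every lineWidth residues. */
    public static String translate(String raw, Translator code, int lineWidth) {
        String dna = normalize(raw);
        StringBuilder protein = new StringBuilder(dna.length() / 3 + 1);
        int residues = 0;
        // i + 2 < length: a trailing incomplete codon is silently ignored.
        for (int i = 0; i + 2 < dna.length(); i += 3) {
            protein.append(code.translate(dna.charAt(i), dna.charAt(i + 1), dna.charAt(i + 2)));
            if (++residues % lineWidth == 0) protein.append('\n');
        }
        return protein.toString();
    }

    public static void main(String[] args) {
        // Dummy code that echoes the first base of each codon in upper case.
        Translator echo = (a, b, c) -> Character.toUpperCase(a);
        System.out.println(translate("AUG gcc\nuaa", echo, 40));
    }
}
```

One design difference: the sketch counts residues instead of testing the length of the buffer. In translateInput() the newline characters count towards protein.length(), so the first line holds 40 residues and the following ones 39.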



When we create the list of genetic codes, we add a listener that calls translateInput() whenever the selection changes.

this.choiceCode = new ListBox();

for(int i=0;i< this.geneticCodes.length;++i)
{
    this.choiceCode.addItem(this.geneticCodes[i].getName(), String.valueOf(i));
}
this.choiceCode.setSelectedIndex(0);
this.choiceCode.setVisibleItemCount(this.geneticCodes.length);
this.choiceCode.addClickListener(new ClickListener()
{
    public void onClick(Widget sender) {
        translateInput();
    }
});


and when we create the TextArea for the DNA, we add a KeyboardListener that calls translateInput() every time a key is pressed.



this.userInput = new TextArea();
this.userInput.addKeyboardListener(new KeyboardListener()
{
    public void onKeyDown(Widget sender, char keyCode, int modifiers) {
        translateInput();
    }
    public void onKeyPress(Widget sender, char keyCode, int modifiers) {
        translateInput();
    }
    public void onKeyUp(Widget sender, char keyCode, int modifiers) {
        translateInput();
    }
});


In the HTML file there is a <div> element with the attribute id="main-id": this is where the script will insert the widgets.


RootPanel.get("main-id").add(tab);


The JavaScript pages are then generated with the script ./Main-compile, which was created earlier by applicationCreator.

That's it !


It worked fine without a single line of JavaScript !!

Updated 2010-08-12: source code

package org.lindenb.gwt.client;

import com.google.gwt.core.client.EntryPoint;
import com.google.gwt.user.client.ui.ClickListener;
import com.google.gwt.user.client.ui.HTML;
import com.google.gwt.user.client.ui.HorizontalPanel;
import com.google.gwt.user.client.ui.Image;
import com.google.gwt.user.client.ui.KeyboardListener;
import com.google.gwt.user.client.ui.Label;
import com.google.gwt.user.client.ui.ListBox;
import com.google.gwt.user.client.ui.RootPanel;
import com.google.gwt.user.client.ui.TabPanel;
import com.google.gwt.user.client.ui.TextArea;
import com.google.gwt.user.client.ui.VerticalPanel;
import com.google.gwt.user.client.ui.Widget;

/**
* Entry point classes define <code>onModuleLoad()</code>.
*/
public class Main implements EntryPoint
{
static abstract private class GeneticCode
{
public abstract String getName();
public abstract char translate(char a,char b,char c);
}

static private class UniversalGeneticCode extends GeneticCode
{
public String getName()
{
return "Universal Genetic Code";
}
public char translate(char c1,char c2,char c3)
{

switch(c1)
{
case 'a':switch(c2)
{
case 'a':
switch(c3)
{
case 'a':return 'K';
case 't':return 'N';
case 'g':return 'K';
case 'c':return 'N';
default: return '?';
}

case 't':
switch(c3)
{
case 'a':return 'I';
case 't':return 'I';
case 'g':return 'M';
case 'c':return 'I';
default: return '?';
}

case 'g':
switch(c3)
{
case 'a':return 'R';
case 't':return 'S';
case 'g':return 'R';
case 'c':return 'S';
default: return '?';
}

case 'c':
switch(c3)
{
case 'a':
case 't':
case 'g':
case 'c':return 'T';
default: return '?';
}
default: return '?';
}
case 't':switch(c2)
{
case 'a':
switch(c3)
{
case 'a':return '*';
case 't':return 'Y';
case 'g':return '*';
case 'c':return 'Y';
default: return '?';
}

case 't':
switch(c3)
{
case 'a':return 'L';
case 't':return 'F';
case 'g':return 'L';
case 'c':return 'F';
default: return '?';
}

case 'g':
switch(c3)
{
case 'a':return '*';
case 't':return 'C';
case 'g':return 'W';
case 'c':return 'C';
default: return '?';
}

case 'c':
switch(c3)
{
case 'a':
case 't':
case 'g':
case 'c':return 'S';
default: return '?';
}

default: return '?';
}
case 'g':switch(c2)
{
case 'a':
switch(c3)
{
case 'a':return 'E';
case 't':return 'D';
case 'g':return 'E';
case 'c':return 'D';
default: return '?';
}

case 't':
switch(c3)
{
case 'a':
case 't':
case 'g':
case 'c':return 'V';
default: return '?';
}

case 'g':
switch(c3)
{
case 'a':
case 't':
case 'g':
case 'c':return 'G';
default: return '?';
}
case 'c':
switch(c3)
{
case 'a':
case 't':
case 'g':
case 'c':return 'A';
default: return '?';
}
default: return '?';
}
case 'c':switch(c2)
{
case 'a':
switch(c3)
{
case 'a':return 'Q';
case 't':return 'H';
case 'g':return 'Q';
case 'c':return 'H';
default: return '?';
}
case 't':
switch(c3)
{
case 'a':
case 't':
case 'g':
case 'c':return 'L';
default: return '?';
}

case 'g':
switch(c3)
{
case 'a':
case 't':
case 'g':
case 'c':return 'R';
default: return '?';
}
case 'c':
switch(c3)
{
case 'a':
case 't':
case 'g':
case 'c':return 'P';
default: return '?';
}
default: return '?';
}
default: return '?';
}
}

}

/**
 * Differences from the Standard Code
 * (Code 3 is the yeast mitochondrial code in the NCBI numbering):
 *
 *        Code 3     Standard
 *  AUA   Met M      Ile I
 *  CUU   Thr T      Leu L
 *  CUC   Thr T      Leu L
 *  CUA   Thr T      Leu L
 *  CUG   Thr T      Leu L
 *  UGA   Trp W      Ter *
 *
 *  CGA   absent     Arg R
 *  CGC   absent     Arg R
 *
 * @author pierre
 */
static private class MitochondrialGeneticCode extends UniversalGeneticCode
{
public String getName() {
return "Mitochondrial";
}

public char translate(char c1, char c2, char c3)
{
if(c1=='a' && c2=='t' && c3=='a') return 'M';
else if(c1=='c')
{
if(c2=='t')
{
switch(c3)
{
case 't': case 'c': case 'a' :case 'g': return 'T';
default: return '?';
}
}
else if(c2=='g')
{
if(c3=='a' || c3=='c') return '?';
}
}
else if(c1=='t' && c2=='g' && c3=='a') return 'W';
return super.translate(c1, c2, c3);
}
}

private GeneticCode geneticCodes[]=new GeneticCode[]{
new UniversalGeneticCode(),
new MitochondrialGeneticCode()

};

private TextArea translated;
private TextArea userInput;
private ListBox choiceCode;

public Main()
{

}



/**
* This is the entry point method.
*/
public void onModuleLoad()
{
TabPanel tab = new TabPanel();
tab.setWidth("100%");
tab.setHeight("100%");
VerticalPanel vbox = new VerticalPanel();
vbox.setVerticalAlignment(VerticalPanel.ALIGN_MIDDLE);
vbox.setHorizontalAlignment(VerticalPanel.ALIGN_CENTER);


vbox.add(new Label("My First Test with the Google Web Toolkit"));

HorizontalPanel hbox= new HorizontalPanel();
hbox.setHorizontalAlignment(HorizontalPanel.ALIGN_CENTER);
vbox.add(hbox);

Image me = new Image("http://www.urbigene.com/plindenbaum.jpg");
me.setTitle(me.getUrl());
hbox.add(new HTML("<span style=\"font-size:24pt;\"><a href=\'http://plindenbaum.blogspot.com\'>Pierre Lindenbaum PhD.</a></span>"));
hbox.add(me);

tab.add(vbox, "About");
tab.selectTab(0);


this.choiceCode = new ListBox();


for(int i=0;i< this.geneticCodes.length;++i)
{
this.choiceCode.addItem(this.geneticCodes[i].getName(), String.valueOf(i));
}
this.choiceCode.setSelectedIndex(0);
this.choiceCode.setVisibleItemCount(this.geneticCodes.length);
this.choiceCode.addClickListener(new ClickListener()
{
public void onClick(Widget sender) {
translateInput();
}
});

vbox= new VerticalPanel();
vbox.add(new Label("Genetic Code"));
vbox.add(this.choiceCode);
vbox.add(new Label("User Input"));
this.userInput = new TextArea();
vbox.add(this.userInput);
vbox.add(new Label("Translation"));
this.translated = new TextArea();
vbox.add(translated);

this.userInput.addKeyboardListener(new KeyboardListener()
{
public void onKeyDown(Widget sender, char keyCode, int modifiers) {
translateInput();
}
public void onKeyPress(Widget sender, char keyCode, int modifiers) {
translateInput();
}
public void onKeyUp(Widget sender, char keyCode, int modifiers) {
translateInput();
}
});
tab.add(vbox, "Translate");



RootPanel.get("main-id").add(tab);
}


private void translateInput()
{
int idx= this.choiceCode.getSelectedIndex();
if(idx==-1) return;
GeneticCode code= this.geneticCodes[idx];
String dna= this.userInput.getText().replaceAll("[ \t\n\r]","").toLowerCase().replace('u', 't');
StringBuffer protein= new StringBuffer(dna.length()/3+1);
for(int i=0;i+2< dna.length();i+=3)
{
protein.append(code.translate(dna.charAt(i), dna.charAt(i+1), dna.charAt(i+2)));
if(protein.length()%40==0) protein.append("\n");
}
translated.setText(protein.toString());

}

}