Kolam ritual @ Bozar

On Saturday, we went to see a Kolam ritual at Bozar. Kolam refers to the designs the Pulluvans draw on the floor for many occasions, using multicoloured sands, rice and spices. Here, it was supposed to be a ritual for a family (“supposed” only because it was a demonstration for the public and no family was actually involved). In this ritual, two women in a trance erase the drawings and answer the family’s questions. The whole ceremony is linked to snakes, which are said to have lived in Kerala before humans and which must be pleased so that both can live together peacefully.

I took some photos but, unfortunately, my camera isn’t good without flash (*). This ritual was part of the India Festival at Bozar. If you want some suggestions of things to see (dance, theatre, music, literature, exhibitions, etc.), my father-in-law gave some here (in French).

(*) I found that people taking photos with flash during this religious ritual (even after the commentator specifically asked them not to) were very rude and insulting to the artists.

Symposium on Neuroproteomics in Gent

This Friday, I attended the Symposium on Neuroproteomics organised at the University of Gent (B). Apart from Deborah Dumont‘s excellent talk, the lectures focused almost exclusively on oxidative stress, neurological diseases and gel-free proteomics (like 2D-LC). One speaker even seemed to talk only to his computer or his slides. So it was not very interesting for me (I am finishing my thesis, which is based on gel proteomics). The organisation was very “basic”: we didn’t even get a free pen and paper (fortunately, I had brought two pens and a notebook).

Dasher: where do you want to write today?

Hannah Wallach put her slides about Dasher on the web (quite similar to these ones from her mentor). Dasher is an “information-efficient text-entry interface”.

What interested me in Dasher is her introduction about the ways we communicate with computers and how computers help us communicate with them. There are keyboards (even reduced ones), gesture alphabets, text-entry prediction, etc. I am interested in the ways people can enter text on a touch screen, without a physical keyboard. Usually, people use a virtual keyboard (as in kiosks for tourists or on handheld devices). But these are apparently not the best solutions.

Dasher takes an interesting approach to entering text, where pulling and pushing elements on screen are used to form words (with the help of the computer, which “guesses” the words from the previous letters). It requires a lot of visual attention, but this can be turned into a feature for people unable to use their hands (as needed for a physical keyboard and mouse; one man even wrote his entire B.Sc. thesis with Dasher and his eyes!).

You can download Dasher for a wide range of operating systems and even try it in your web browser (Java required). By the way, it’s the first piece of software I have seen that adopted the GNU GPL 3. After reading the short explanation, you’ll easily be able to write your own words, phrases and texts.

The Dasher team is interested in the way people interact with the computer; they use a language model to display the next letters. On the human side, I wonder whether this kind of tool has an influence on how the human brain works. Visual memory is involved with a physical keyboard (“where are the letters?”) but also here (same question, but the location of the letters changes all the time). Here, letters move, but one can learn that boxes are bigger when the next letter’s probability is higher. How is the brain involved in such a system? What is it learning exactly? Are there fast and slow learners in this task? It could be interesting to look at this …
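The size principle mentioned above can be sketched in a few lines of Python (a toy illustration of my own, not Dasher’s actual code; the probabilities are made up):

```python
def box_heights(probabilities, total_height=1.0):
    """Divide the screen height into one box per letter,
    proportionally to each letter's probability (the core
    idea behind Dasher's interface)."""
    total = sum(probabilities.values())
    return dict((letter, total_height * p / total)
                for letter, p in probabilities.items())

# a made-up next-letter model after seeing "q" in English text
after_q = {'u': 0.97, 'i': 0.02, 'a': 0.01}
heights = box_heights(after_q)
```

With such a model, the box for “u” would occupy almost the whole height of the screen, so the user barely has to move to select the most likely next letter.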

Proteom'Lux 2006

From the 11th to the 14th of October, I was at Proteom’Lux 2006, an international conference on proteomics held in Luxembourg. I presented a poster, learned quite a lot, met many very interesting people and now have a clearer view of the directions and additional details needed in the proteomic part of my work. Some people presented interesting new ideas (QconCat, MS-Blast, …). I am still assimilating all this information …

The conference schedule was very “tight” but, when I took some time to visit Luxembourg city a little, I found a place I really wanted to enter 😉 Finally, it was also the first time I saw a biology lecture given with LaTeX Beamer (the organisers also had a laptop with Ubuntu and an OpenOffice.org Impress presentation, but apparently it wasn’t for a lecture).

Town and province elections in Belgium

Today, we were required to vote in the Belgian town and province elections (voting is mandatory in Belgium). A certain percentage of polling stations used electronic voting. After identification, a person gives you a (presumably blank) magnetic card; you enter a voting booth, insert the card into a computer and, with a stylus, you point at the screen. The screen mimics the paper used in the old-fashioned way of voting: white circles to the left of each candidate’s name. After you have voted, the computer gives your card back and you simply put it in a ballot box. If you want to know more about the potential problems with electronic voting, you can look at Poureva, Recul démocratique and the Wikipedia article about electronic voting, for example.

While looking for articles about electronic voting, I found this one about the perception of electronic voting in Belgium: “Electronic Voting in Belgium: A Legitimised Choice?” [1]. The authors tried to figure out:

  1. how easy/difficult it was for electors to vote on a computer: apparently, the vast majority of people find the procedure easy;
  2. to what extent they trust voting on a computer: apparently, a majority of people find this voting system trustworthy;
  3. whether they have a philosophical/social opposition to voting on a computer: apparently, nearly no one does.

Although I don’t criticize their method, I don’t totally agree with their conclusion:

it seems quite clear that the introduction of electronic voting in Belgium is a choice that is legitimised by the vast majority of the population. More than 80 per cent answered positively to the questions regarding ease of use, confidence and social acceptance with regard to the new method of voting.

  • To legitimise is “to make legitimate”, “to make (something) legal or acceptable”.
  • Legitimate = “allowable according to law, or reasonable and acceptable”.

There is no need to discuss the legality of electronic voting since, as they said, a Belgian law was passed to establish it. But we can question the way politicians passed this law. Did they ask citizens about it? Had they announced it in their programme before the election? The vast majority of the population may well think that the electronic voting system is acceptable. But are they fully informed of all the ins and outs of such systems? If the government absolutely needs an electronic way of voting, can’t it give the voters a way to verify their vote, and observers a way to verify the votes after the election?

A simple solution could be a small printer in the voting booth, printing the voter’s vote as soon as she/he takes back her/his card. This way, the voter can check whether the machine printed the vote correctly. This vote is supposed to be the same as the one on the magnetic card. And if an observer has any doubt about it (or anything else), the votes can still be re-counted from the papers. The magnetic cards give a simple way to obtain the results quickly; the papers give a simple way to recount the votes.
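To make the double-count idea concrete, here is a toy sketch (entirely hypothetical; not the code of any real voting system): the fast electronic tally from the magnetic cards should match a manual recount of the printed papers.

```python
def tally(votes):
    """Count the number of votes for each candidate."""
    counts = {}
    for vote in votes:
        counts[vote] = counts.get(vote, 0) + 1
    return counts

def recount_matches(card_votes, paper_votes):
    """True when the electronic tally (magnetic cards) agrees
    with the manual recount of the printed ballots."""
    return tally(card_votes) == tally(paper_votes)
```

Any mismatch between the two tallies would then justify a full manual recount.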

[1] P. Delwit, E. Kulahci and J.B. Pilet. “Electronic Voting in Belgium: A Legitimised Choice?” Politics, 25: 153-164 (copy).

Free communication at the BASS Autumn Meeting

I went to Gent last Friday for the BASS Autumn Meeting. With “New drugs for sleep” as a title and mainly physicians and psychiatrists in the audience, I didn’t expect to have a lot of “basic science” presentations, but the University of Liege was well represented by T. Dang-Vu, P. Peigneux, C. Schmidt and me in the free communications section (by the way, all four of us are from the Cyclotron Research Center). I outlined some recent findings on proteins differentially expressed after short-term sleep deprivation. I got a nice question from Prof. Verbraecken (UZA) and, next time, I’ll focus more on the pathways and physiological implications of the proteins found, rather than on their functions only.

Looking for a good free UML2 modelling editor …

I was using Poseidon as a modelling editor for my UML2 diagrams. It is Java-based and I was able to run it on both GNU/Linux and MS-Windows. It was not free software, but the Community Edition was free (as in “free beer”) and had all the tools I modestly needed. The only catch: all the diagrams had a string at the bottom stating they were not meant to be used for commercial purposes (for educational purposes, I wrote a small program that removes it).

Today, the Gentleware boss announced that Poseidon will go away. He said it will be replaced by Apollo for Eclipse and by a new licensing model (renting) starting at €5 per month. An unregistered version will be available, but it won’t be possible to export, print, save, etc.

First, this shows one of the problems of using free-as-in-free-beer-but-proprietary software (as opposed to really free software): the owner can change the licence, the availability and the usage conditions at any time. Secondly, although I understand the move from a commercial/business point of view (if they need money), I wonder if they are not depriving themselves of a potential user base (Community Edition users who would recommend the paid version in a professional environment).

Anyway, I am now looking for a new, good and free UML2 modelling editor. After a quick search, I’ve found:

  • ArgoUML, a Java-based editor supporting UML 1.4, able to import/export Java code (*) (BSD licence)
  • Umbrello UML Modeller, written for KDE only; it can import/generate code from/for Python, Java, C++, … (GPL)
  • BOUML is, I think, the only one in this list that supports UML2; it can generate code for C++, Java (and IDL) (GPL)
  • PyUT, a class diagram editor written in Python that supports UML 1.3; it can import/export Python and Java source code and export C++ code (GPL)

I really don’t have time for the moment to test all these software. As soon as I’ll have time, I’ll give them a try. Meanwhile, if you have other suggestions and/or any experience with one of them, please feel free to post a comment.

(*) Although I am not comfortable with code auto-generation tools, the ability to import/generate code for a programming language is a good indication of the modelling tool’s ability to understand and take into account that language’s specificities. You don’t want Java syntax highlighting when developing a Python application.

RNA-oriented Nobel Prizes

Of the six 2006 Nobel Prizes, two were awarded to people involved in RNA research. The 2006 Nobel Prize in Medicine was awarded to Andrew Fire and Craig Mello “for their discovery of RNA interference – gene silencing by double-stranded RNA”. And the 2006 Nobel Prize in Chemistry was awarded to Roger Kornberg “for his studies of the molecular basis of eukaryotic transcription”.

RNA interference is a mechanism whereby a “double-stranded ribonucleic acid (dsRNA) interferes with the expression of a particular gene”. And transcription is basically the process through which a DNA sequence is copied to produce a complementary RNA.
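As a toy illustration of that complementarity (my own sketch, unrelated to the prize-winning work itself), transcribing a DNA template strand into RNA just means replacing each base by its complement:

```python
def transcribe(template_strand):
    """Return the RNA complementary to a DNA template strand
    (A -> U, T -> A, G -> C, C -> G)."""
    pairs = {'A': 'U', 'T': 'A', 'G': 'C', 'C': 'G'}
    return ''.join(pairs[base] for base in template_strand)
```

For example, the template strand TACG would be transcribed into the RNA sequence AUGC.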

A few years ago, everyone was talking about genomics, the study of genes. Now people working on RNA win Nobel Prizes. Knowing that DNA is transcribed into RNAs and that some RNAs (mRNAs) are later translated into proteins, I predict that we’ll see a future Nobel Prize in proteomics, the study of proteins. 😉 OK, Fenn, Tanaka and Wüthrich already won the Nobel Prize in Chemistry in 2002 for soft ionisation methods in mass spectrometry and for NMR spectroscopy, techniques used, among others, to identify proteins and study their structure. And Blobel won a Nobel Prize in 1999 for protein targeting.

Playing with Python and Gadfly

Following my previous post, where I retrieved EXIF tags from photos posted on Flickr, here is the next step: my script now stores its data in a database.

There are a lot of free database wrappers for Python. Although I first thought of using pysqlite (because I am already using SQLite in another project), I decided to use Gadfly, a real SQL relational database system entirely written in Python. It does not need a separate server, it complies with the Python DB-API (allowing easy changes of database system) and it’s free.

Using Gadfly is very easy and its tutorial is very comprehensible. Put together, here is an example of creating a database, adding some data and retrieving it (testGadfly.py):

#!/usr/bin/python
# test for Gadfly
import os
import gadfly
import time

DBdir = 'testDB'
DBname = 'testDB'

if os.path.exists(DBdir):
    print 'Database already exists. I will just open it'
    connection = gadfly.gadfly(DBname, DBdir)
    cursor = connection.cursor()
else:
    print 'Database not present. I will create it'
    os.mkdir(DBdir)
    connection = gadfly.gadfly()
    connection.startup(DBname, DBdir)
    cursor = connection.cursor()
    cursor.execute("CREATE TABLE camera (t FLOAT, make VARCHAR, model VARCHAR)")

print 'Add some items'
t = float(time.time())
cmake = 'Nikon'
cmodel = 'D400'
# use a parameterised query (Gadfly follows the DB-API "qmark" style)
cursor.execute("INSERT INTO camera (t, make, model) VALUES (?, ?, ?)",
               (t, cmake, cmodel))

print 'Retrieve all items'
cursor.execute("SELECT * FROM camera")
for x in cursor.fetchall():
    print x

connection.commit()
print 'Done!'

Regarding the initial project, the script became too long to be pasted in this post, but you can download it here: flickrCameraQuantifier2.py (5 kB). To run it, you should have installed the EXIF and Flickr wrappers and the Gadfly DB system. At the beginning of the script, you can define the total number of iterations (sets of queries) you want (variable niterations), the sleep duration between queries (variable sleepduration) and the number of photos to get for each query (variable nphotostoget). Everything will then be stored in a Gadfly database (default name: cameraDB). If you want to read what is stored, here is a very basic script: flickrCQ2Reader.py.

For example, I’ve just asked 125 queries (with 5s between each query). I’ve got 88 photos (70.4% of queries) with 27 photos without EXIF tags (30.68% of all the photos). Among all the camera makers, Canon has 27%, Fuji has 11%, Nikon has 18% and Sony has 21% of all the photos with EXIF tags at that moment. This is approximately what Flagrant disregard found. I don’t have time anymore but one could improve the data retrieval script in order to automate the statistics and their presentation …

Edit on October 9th, 2006: added the links to the missing scripts

Playing with Python, EXIF tags and Flickr API

A few days ago, I was quite amused by Flagrant Disregard’s Top Digital Cameras: these people daily sample 10,000 photos uploaded to Flickr and look at their camera makes and models. This kind of study is interesting because one can see what people are actually using and which camera models can give good results (with a good photographer, of course). I was just disappointed that they say nothing about their sampling method or the statistics applied to their data. I then thought I could do a similar kind of survey and publish the results along with the method.

One more time, I’ll do this with Python. Instead of reading binary data from JPEG files myself to look at the EXIF tags, I’ll use external “modules” (wrappers). After a small survey on the web, it seems that GeneCash’s EXIF.py is the best solution. Indeed, to get the camera make and model of a test image, the code is simply:

import EXIF
f = open('testimage.jpg', 'rb')
tags = EXIF.process_file(f)
print "Image Make: %s - Image Model: %s" % (tags['Image Make'], tags['Image Model'])

Now, to access Flickr’s most recent photos, I had two options:

  1. I open the Flickr most recent photos page and parse the HTML to get the photos; this can be done with regular expressions or XML parsing.
  2. I use the Flickr API, where there is a specially designed method: flickr.photos.getRecent

I chose the second option and looked at the three kits for Python referenced by Flickr:

  • The FlickrClient author admits his kit is outdated and gives a link to Beej’s Python Flickr API.
  • Beej’s Python Flickr API seems interesting, but there isn’t much documentation and, being a beginner in Python, I was quickly lost.
  • Finally, James Clarke’s flickr.py seemed to be a nice and easy-to-use wrapper, so I decided to go with it.

Unfortunately, the getRecent method isn’t implemented (James Clarke has not maintained this wrapper since 2005). I tried to use the photos_search method (a wrapper for the flickr.photos.search method), hoping that calling it without any tag would give me the most recent photos. But some people probably thought of it before me, because Flickr disabled parameterless searches. Look at the error:

import flickr
z = flickr.photos_search('', False, '', '', '','', '', '', '', '', '2', '', '')

Traceback (most recent call last):
[...]
FlickrError: ERROR [3]: Parameterless searches have been disabled. Please use flickr.photos.getRecent instead.

So, I was forced to implement the getRecent method myself. Fortunately, it wasn’t too difficult. Here is the code you can insert at line 589 of James Clarke’s flickr.py (or download my flickr.py here):

def photos_getrecent(extra='', per_page='', page=''):
    """Returns a list of Photo objects.

    """
    method = 'flickr.photos.getRecent'

    data = _doget(method, API_KEY, extra=extra, per_page=per_page, page=page)
    photos = []
    if isinstance(data.rsp.photos.photo, list):
        for photo in data.rsp.photos.photo:
            photos.append(_parse_photo(photo))
    else:
        photos = [_parse_photo(data.rsp.photos.photo)]
    return photos

Now that I have Python, an EXIF wrapper and a Flickr wrapper with a getRecent method, I can write a small script that fetches the 10 most recent images from Flickr and displays their camera make and model (if they have one) (flickrCameraQuantifier.py):

#!/usr/bin/python
import urllib2

import EXIF
import flickr

recentimgs = flickr.photos_getrecent('', '10', '1')

imgurls = []
for img in recentimgs:
    try:
        imgurls.append(str(img.getURL(size='Original', urlType='source')))
    except:
        print 'Error while getting an image URL'

for imgurl in imgurls:
    imgstream = urllib2.urlopen(imgurl)
    # save the image (write the raw bytes in one go)
    f = open('tmp.jpg', 'wb')
    f.write(imgstream.read())
    f.close()
    # get the tags
    f = open('tmp.jpg', 'rb')
    try:
        tags = EXIF.process_file(f)
        if len(str(tags['Image Make'])) > 0:
            if len(str(tags['Image Model'])) > 0:
                print "Image Make: %s - Image Model: %s" % (tags['Image Make'], tags['Image Model'])
            else:
                print "Image Make: %s" % (tags['Image Make'])
        else:
            print "No Image Make nor Model available"
    except:
        print 'Error while getting tags from an image'
    f.close()

print "Done!"

Out of 10 images, it can usually give 7 to 9 camera models. I haven’t checked yet whether the errors are due to my script or to the lack of EXIF tags in the submitted images. The EXIF tag detection is a bit slow (imho) but it’s OK. And it’s a “one shot” script: once it finishes its work, nothing remains in memory. So, the next step is to use a flat file or a database connection to remember the details found.

I suggest the following method: every 5 minutes, the script retrieves the most recent photo uploaded to Flickr and stores its camera make and model somewhere. Each day, one would then be able to do some decent statistics. I prefer sampling 1 photo every few minutes rather than 10 photos at one precise moment, because people usually upload their pictures in batches; there is then a risk that those 10 photos are from the same person and taken with the same device.
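That sampling policy could be sketched as follows (sample_photos and fetch_camera are hypothetical names of mine; fetch_camera stands in for the Flickr + EXIF code above and is injected so the loop can be tested without network access):

```python
import time

def sample_photos(fetch_camera, n_samples, interval_s, sleep=time.sleep):
    """Collect one camera sample per interval; fetch_camera returns
    a (make, model) tuple, or None when no EXIF data is available."""
    samples = []
    for _ in range(n_samples):
        camera = fetch_camera()
        if camera is not None:
            samples.append(camera)
        sleep(interval_s)
    return samples
```

A call like sample_photos(my_flickr_fetcher, 288, 300) would then collect one sample every 5 minutes for a day.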