Month: May 2007

A small post from Luxembourg

A small post from the Benelux Sleep Congress 2007 (where they left two unprotected wifi networks near and in the congress hall 🙂 ). It’s mainly a medical congress, but I had very interesting discussions with, among others, Prof. Peter Meerlo and Dr. Michel Cramer Bornemann, mainly about Gemvid and the proteomic aspect of my Ph.D. Let’s see what I can do with all these contacts …

Otherwise, the Domaine Thermal of Mondorf-les-Bains is a beautiful place (ok, it’s not as natural as the landscapes around the highway and national roads going to Luxembourg). Unfortunately, I left my camera at home (a really stupid decision taken in this morning’s rush).

Another reason why free software matters

This morning, I read Tim Anderson’s “Why Microsoft abandoned Visual Basic 6.0 in favour of Visual Basic .NET”. While reading his article, I had only one idea in mind: this is another example of the importance of free and open source software. If you are not a programmer, you don’t need to read the remainder of this post; software users have many other reasons to prefer free software over closed-source software (but that’s not the subject of this post).

Basically, Visual Basic was abandoned because it was based on an old library, it lacked object orientation and it limited programmers’ creativity compared to other languages available at the time (C++, Delphi, etc.). Per se, shortcomings in programming languages are not unusual and, considering the origin and goal of (Visual) Basic, people could have foreseen this abandonment. What is interesting to see is that Microsoft took the somewhat brave decision to break compatibility with the previous Visual Basic. This left recreational and professional programmers, as well as companies, in a strange zone where they need the old Visual Basic for sometimes critical software but no longer have any support for the language and its extensions. Of course, Tim Anderson added that you can still write these extensions (the infamous .DLLs) in other languages and link them with Visual Basic code. And there is informal support from The Community.

Things could have been better if the language had been standardized and, even better, if the language’s specifications and evolution had been open to the community.

First, standardization … C, C++ and Javascript (for example) have been standardized for a long, long time (longer than Visual Basic even existed) and they are still widely used. Depending on the source, their usage may be decreasing or stable but, along with Java, C and C++ are the only languages used by more than 10-15% of projects. Standardization means specifications are available, so people and companies can build their own compilers (C, C++), implement their own interpreters (Javascript in browsers) and create their own libraries around these languages. There is no fear that your favorite IDE/compiler/… will be abandoned because, if it happens, you will always be able to use other tools and continue to use the same language.
But standardization isn’t the only keyword. C# and part of .Net are also standardized, but there is a “grey zone” where you don’t know what is patented (yes, another bad thing about software patents) and by whom. This uncertainty may hamper the development of alternatives to the “official” compiler/IDE/… (see the discussions about the Mono project on the internet).

These problems about patents and who owns what lead to another important aspect of a programming language: how it is developed. If one company holds everything (as with Visual Basic, C# and Java), it is very difficult to suggest improvements, submit them, file bug reports, etc. And once the company decides it has no more interest in the language, all efforts and developments slowly become useless, as we now see with Visual Basic. Languages like Perl and Python are openly developed by the community. Efforts are usually catalysed by a steering committee (or a similar structure), but should this committee unilaterally decide to halt the development of a language, the source code and the discussions around it can still be used by other people.

So, if you can choose, and even if some languages are in the hype for the moment, choose a free/open language: your efforts to study it and to learn its tricks and details will never be nullified by a third party’s decision.

Are re-examined patents still valid?

In “Patenting the obvious?” (pdf), you’ll read about people fighting against a patent on methods for making embryonic stem cells from primates. I won’t go into the details of the patent itself (although I think that there shouldn’t be any patent based on or containing living “things” or parts of them). I just want to share my surprise when I read this (emphasis is mine):

In its 2 April statement, the patent office said that it accepted these arguments, and intended to revoke the patents. WARF has until June to respond to the decision, and if it is unhappy with the outcome, it can then initiate an appeal. The patents will be treated as valid until the re-examination process is complete — that is, until WARF’s response and the possible appeal have concluded. That could take years.

In other words, you can file as many patents as you want, even if they are stupid, even if you don’t disclose prior art (making your patent irrelevant): even if your patent is challenged, it will remain valid for years! How is this possible?

I’m not a lawyer, but I looked for details about the classical appeal procedure … Let’s say two people enter the court, one as defendant and the other as plaintiff. Both are presumed innocent. After the trial, if the judgment does not please one of the parties, that party can file a notice of appeal and the whole matter will be heard by the next higher court with jurisdiction over it. During the appeal, both parties remain presumed innocent! It is as if nothing had happened and everything has to start again. Apparently, at the US Patent and Trademark Office, this is not the case: once it has taken a decision, that decision cannot be changed until the end of all appeals. I think this is also wrong.

Mapping cameras in Liege, 1 month later

Nearly a month after the initial launch of my map of CCTV cameras in Liege, quite a number of people have contributed cameras to this map (some of them heavily; thanks to all of them). Currently, we have identified 79 cameras, but it seems we are far from finished since, according to some sources, there were more than 109 CCTV cameras in Liege at the end of 2006! Does it put you at ease that 40 cameras were added in one go?

Disappointed by BSN meeting

I’m very disappointed by this BSN meeting. This event is organised every 2 years, so you might expect some quality standards. Well, don’t expect too much … (don’t expect anything, in fact).

The morning talks were ok, nothing more: no better and no worse than at any other congress. But the poster session was not organised at all and there was no support from senior scientists … Moreover, the authors of about a third of the posters didn’t even deign to come and hang their poster! Most senior scientists left before the afternoon poster session (usually, questions from seniors are more useful than other students’ questions); maybe 2-3 seniors were left (for the whole of Belgium!!!). And as a final touch, lunch was not free (not even sandwiches!) although we paid 45€ for registration (free for members – membership is 12€ per year for students). Instead we were redirected to the UAntwerp canteen … Were they not smart enough to find a sponsor? I think it would have been better to attend only the Neuroinformatics Meeting.

We decided to leave at 16.00: of the 200 people there at 14.00, approximately 20 were left, most of the posters had already been removed and no one was there anymore 1) from my lab 2) to have interesting discussions (apart from the weather). It’s sad for the last speaker but, hey, if the meeting’s organising committee treats us like s..t, what do you expect? We spent at least one day designing our posters (I know people who spent a whole week on theirs!) and no one is there to discuss them? We spent hours on highways and in traffic jams trying to enter Antwerp at 9.00, only to end up on the same level as inept people who didn’t even come to hang their poster? I’m sure they didn’t even design their posters. And the result is the same for everyone: one line on the CV. I’m really disappointed.

Note: Alexandre Dulaunoy already left a comment on one of my photos of the meeting on Flickr.

How are you using tags?

I’m wondering how people are using tags and how this differs from keyword usage in the scientific literature.

Usually, when I add tags on web services like del.icio.us or Flickr, I tend to add as many tags as possible. For example, even if a man is not the main subject of a photo, I’ll use the tag “man”. The rationale is that we never know if, one day, I (or someone else) would like to find a photo with a man and a tree (for example), the tree being the main subject. The problem is that I think I’m “diluting” the power of the main tags. Another example … for a website helping to find post-doc jobs, I’ll use the following tags: “jobs postdoc research science grants PhD job”. The problem is that “grants” is not really related (there is no list of available grants; only some of the jobs require grants, and you never know what you’ll look for later).

In the biomedical sciences (and many other scientific fields, I guess), we use “keywords” when submitting a paper to a peer-reviewed journal. This helps in the selection of peer reviewers but, more importantly, it allows us to find interesting papers. The main difference with tags, imho, is that we only use a small number of keywords. For example, in this article, the author only used 4 keywords (and that is considered sufficient). If this article had been a webpage, I would have added some more tags: MS, Mw, pI, proteomics, …

Why is there a difference? Is it relevant? How are you using tags? Is there a “good” strategy?

I collected tag lists from some users and tried to compare them (*) to my own tag list …

user abbrev    N links    N tags    mean citation per tag    max citation for a tag
je                 401       757                     3.52                        84
ad                 614      1123                     3.07                        98
do                 113       195                     2.33                        27
ch                2320       582                    15.86                       326
de                3528      1550                    13.29                       923
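The per-user statistics in the table (number of links, number of tags, mean and max citations per tag) are straightforward to compute once the bookmarks are available locally. A minimal sketch in Python (the URLs and tags below are made up for illustration; real data would come from a bookmark export):

```python
from collections import Counter

# Hypothetical sample: each bookmark (link) carries a list of tags,
# as you would get from a social-bookmarking export.
bookmarks = [
    ("http://example.org/jobs", ["jobs", "postdoc", "research"]),
    ("http://example.org/lab", ["research", "science"]),
    ("http://example.org/photo", ["man", "tree", "photo"]),
]

def tag_stats(bookmarks):
    """Return (N links, N tags, mean citation per tag, max citation for a tag)."""
    # Count how many bookmarks cite each tag
    counts = Counter(tag for _, tags in bookmarks for tag in tags)
    n_links = len(bookmarks)
    n_tags = len(counts)
    total_citations = sum(counts.values())
    return n_links, n_tags, total_citations / n_tags, max(counts.values())

print(tag_stats(bookmarks))
```

On this toy sample, 3 links carry 7 distinct tags; “research” is cited twice, all the others once.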

With only 5 people, I don’t pretend that this is significant … We clearly have two groups: me and my friends (the first 3 rows), with < 1000 links and a mean citation per tag of around 2-3. The last two rows are from 2 people taken “at random” (well, I eliminated people with < 1000 links, like those in the 1st group). When I plot the histogram of tag usage, I always get the same trend: a huge number of tags used only a few times and very few tags cited very often (as expected, see figure below).
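That “long tail” shape can also be summarised numerically, by counting how many tags are cited exactly once, twice, and so on. A small sketch with invented per-tag citation counts:

```python
from collections import Counter

# Hypothetical per-tag citation counts; real values would come from a
# bookmark export like the ones behind the table above.
tag_citations = {"jobs": 1, "man": 1, "tree": 1, "grants": 1,
                 "photo": 2, "science": 2, "postdoc": 3, "research": 8}

# histogram[k] = number of tags cited exactly k times
histogram = Counter(tag_citations.values())
for uses in sorted(histogram):
    print(f"{histogram[uses]} tag(s) cited {uses} time(s)")
```

With these made-up numbers, half of the tags are cited only once while a single tag (“research”) dominates, which is exactly the shape of the histogram below.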

[Figure: histogram of my tag usage]

Rashmi Sinha’s cognitive analysis of tagging is a good starting point for understanding the tagging process. But it would be nice to find other important resources and/or to learn from others’ experiences …

(*) Data and Python scripts available upon request. I had to write my own Python scripts to retrieve the data since, unfortunately, Michael G. Noll’s unofficial Python API for research was not available anymore.

Update on May 6th (a bit later): Michael’s API is back! I’ll use it later 🙂 Thanks, Michael. If you want to spend your holiday in Canada, you can go to ACM Document Engineering 2007, where he’ll present a paper related to this subject. Another thing: when I looked again at the table above, there are two “trends” (remember, I don’t pretend to be exhaustive or significant): people with < 1000 links have more keywords than links; people with > 1000 links have more links than keywords. Is there a more precise limit? I guess this has something to do with the fact that people are only interested in a “small” number of subjects and tend to collect as many variations (links/webpages) as possible on each subject.