Month: August 2006

Another scientific paper from the Poirrier-Falisse!

Finally, a second scientific paper is published by the Poirrier-Falisse (a first paper for me):

Poirrier JE., Poirrier L., Leprince P., Maquet P. “Gemvid, an open source, modular, automated activity recording system for rats using digital video”. Journal of Circadian Rhythms 2006, 4:10 (full text, doi)

It is still in a provisional PDF version but already available on the web and Open Access (of course)! Here is my BibTeX entry. I will upload the source code tonight on the project website.

Done some spot picking

Today, I did some “spot picking”. In 2D electrophoresis, you separate proteins in a gel according to their electric charge and mass. You obtain a kind of map of proteins and, if you stain these proteins, you have a map of spots (example here). After some analysis, you may want to identify some proteins of interest. The problem is that they are still in the gel! So, today, I used a robot called a “spot picker” that … picks spots representing proteins of interest out of the gel. You can see what a spot picker looks like in my proteomic set on Flickr.

How to fight spam in a wiki?

On Friday, having to wait for a librarian to fetch the old articles I wanted to read, I spent a few minutes removing spam from the AEL wiki. This form of spam is very easy to spot because it’s always the same: <small> HTML tags enclosing 30 links, and the text linking to these sites contains well-known spam, adult-oriented words (see the end of the MsSecurity page, where I didn’t have time to remove the spam).

After the librarian gave me my articles, I went back to my lab, thinking of a possible solution … This kind of spam is constant. Why not write a simple software bot that fetches each and every page on a wiki, checks whether there is some objectionable content in it, and then either cleans the content or goes on to the next page? This bot wouldn’t prevent spam but could act quickly after it appears (e.g. if you launch it every hour/day/week with cron).
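The "check and clean" step of such a bot could be sketched as follows in Python. This is only a rough illustration of the idea, not the bot itself: the names `clean_page`, `is_spam_block` and `LINK_THRESHOLD` are made up for this example, the threshold of 10 links is an arbitrary assumption, and fetching and saving pages is left out because it depends on the wiki software.

```python
import re

# Assumption: a <small> block holding this many links or more is spam.
# The value is arbitrary; the spam described above had about 30 links.
LINK_THRESHOLD = 10

# A <small>...</small> block, possibly spanning several lines.
SMALL_BLOCK = re.compile(r"<small>.*?</small>", re.IGNORECASE | re.DOTALL)
# Each occurrence of a URL scheme counts as one link.
LINK = re.compile(r"https?://", re.IGNORECASE)


def is_spam_block(block, threshold=LINK_THRESHOLD):
    """Return True if the block contains at least `threshold` links."""
    return len(LINK.findall(block)) >= threshold


def clean_page(text, threshold=LINK_THRESHOLD):
    """Remove spam-like <small> blocks from a page.

    Returns the cleaned text and the number of blocks removed.
    """
    removed = 0

    def replace(match):
        nonlocal removed
        if is_spam_block(match.group(0), threshold):
            removed += 1
            return ""  # drop the whole spam block
        return match.group(0)  # keep legitimate <small> text untouched

    cleaned = SMALL_BLOCK.sub(replace, text)
    return cleaned, removed
```

A cron job could then loop over all the pages, call `clean_page` on each one and save only the pages where the removed count is greater than zero. Counting links rather than matching spam words keeps the sketch short, but it would miss spam that is not wrapped in <small> tags.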

This afternoon, I thought I could not be the only one trying to find ways to fight spam on wikis. Indeed, fighting wiki spam already has a verb, “to chongq” (although it also includes retaliation), a Wikipedia page (even two) and many other dedicated pages.

Basically, there are three types of approach to fighting spam: wiki-specific methods, general http/web methods and manual actions.

  1. Wiki-specific methods are add-ons to your wiki system that help prevent spammers from modifying your wiki. For example, Wikimedia has its anti-spam features and a Spam blacklist extension, TWiki has a Black List plugin, etc. Once set up, you generally do not need to care about them (except to check that they are working properly, to update them, etc.).
  2. General http/web methods use general web mechanisms and/or special features independent from the wiki software you use. These systems are also automated, like Bad Behaviour, use of CAPTCHA images, use of the “rel=nofollow” attribute in link tags, etc.
  3. Finally, manual actions can be taken by any human: removing spam like I did, renaming well-known wiki pages like sandbox, etc. The only advantage of this method is that the human brain can easily adapt itself to new forms of spam. Otherwise, it’s rather time consuming …

Finally, I read that some spam bots are removing spam, but only a part of it. This is the kind of thing I would like to do, but it should remove all the spam. (But before this one, I should begin the simple, geek blog software).

Addendum on August 21st, 2006: independently of this post, Ploum made an interesting summary of a post from Mark Pilgrim (this post looks rather old: 2002!). In his post, Mark Pilgrim sees two ways of fighting spam: Club or Lojack solutions.

With a Club solution, your wiki is protected against lazy spammers. Clubs are technical solutions that make it harder for spammers to vandalize your website/wiki/blog/etc. The Club works as long as not everyone has it. Once everyone has Clubs, spammers will think a little bit and update their software to circumvent most of them. In conclusion, “the Club doesn’t deter theft, it only deflects it.”

With a Lojack solution, your wiki isn’t necessarily protected but spammers who vandalize it will be traced back. “Although it does nothing to stop individual crimes, by making it easier to catch criminals after the fact, Lojack may make auto theft less attractive overall.”

My bot that completely removes spam is definitely not a Lojack. But it’s not a Club either. This tool will allow you to be spammed and it will not trace spammers back. Still, it will make it less attractive for spammers to add links on wikis, since the links will be removed soon after being added.

(Btw, I’ve just noticed that comments were automatically forbidden for any post. That was not intentional)

Part of wish list for Christmas ;-)

For a few months, I have been telling myself that I need a new mobile phone to replace my old Nokia 3410 (the battery is nearly dead, some keys are not working all the time, etc.). Yesterday, Trolltech, the company behind Qt, announced the shipping of a green Linux-based mobile phone in September 2006. I don’t know if it will be a true product or “only” a development tool (a kind of prototype for developers). But I know that I want one if it’s available.

"Why groupthink is the genius of the internet"

In the August 10th, 2006 issue of the Financial Times [1](*), Patti Waldmeir wrote a column about a new book [2] she recently read.

In this book, Sunstein starts from a 1973 quotation from F. Hayek, a liberal philosopher and economist:

Each member of society can have only a small fraction of the knowledge possessed by all and … civilisation rests on the fact that we all benefit from knowledge which we do not possess.

While Sunstein knows the potential flaws of today’s collaborative internet projects (wikis, blogs, etc.), he argues that “sharing scientific information online would cure some of the worst problems of the U.S. patent system and foster innovation much more efficiently than costly patent litigation”.

Before the internet, we used to look for solutions by asking family or neighbours. Now, we look on the internet, where people genuinely want to communicate their knowledge. Groupthink may be “the genius of the internet”, but it already was the genius of any group, with or without computers and networks.

Hmmm … Anyway, this author seems interesting to read …

[1] Waldmeir, P. “Why groupthink is the genius of the internet”. Financial Times, August 10th, 2006, p. 5 (article unavailable without subscription)

[2] Sunstein, C. “Infotopia: How Many Minds Produce Knowledge”. Oxford University Press, October 2006.

(*) I am taking advantage of a free 4-week subscription to the Financial Times. That’s why this is my second post about an article published in this newspaper. But I don’t think I’ll subscribe: 1. I have other things to read; 2. business and finance are not my core business; 3. I don’t understand half of what they write (especially in the “Market data” and similar pages).

P.S. According to F. Hayek’s biography on Wikipedia, this political philosopher also made an inroad into cognitive science, independently developing an alternative “Hebbian synapse” model of learning and memory. Another interesting author to read …

Happy Independence Day!

Today, 15th of August, is the Independence Day of India. Happy Independence Day!

Picture from ISAL

Today, the Indian Embassy in Belgium held a very small ceremony (but I was in my lab at that moment; MS-Word invitation). I guess it is/was celebrated all over India (more pictures on the Times of India website).

By the way, Bozar is organizing an India Festival from October 2006 to January 2007. You’ll be able to enjoy exhibitions, listen to music, see theatre plays, listen to literature, and watch cinema and dance, both from old-style India and from modern India. If I have to pinpoint one event, it will be the Dhrupad concert where my father-in-law will sing with the Gundecha Brothers (dhrupad on Wikipedia). It will be on Wednesday 17.01.2007 at 20:00. But, of course, there will be many more great artists from India …

Links: Original and GEGL

While looking for something totally different (what exactly is the Mascot score? Partial answer here), I found the Original photo gallery, a two-part tool to get digital photos on the web. This tool could be interesting for the family website I plan to build. Since my hosting company enabled PHP Safe Mode, I cannot use most on-line gallery tools. Original seems to be an ideal solution because all the processing is done off-line, on my own PC. Then everything is uploaded to the website. Still, it’s not a static gallery like the one Picasa produces (for example).

I also saw GEGL (Generic Graphics Library), “an image processing library for on-demand image processing. It is designed to handle various image processing tasks needed in GIMP.” They just released a first version. Øyvind Kolås is also maintainer of Babl, a “dynamic, any to any, pixel format conversion library”. I am interested in all kinds of (free as in free speech) image processing libraries because I’m trying to correctly open and manipulate .gel files (see the problems described in this forum thread; basically, they are .tiff files with additional tags and a different way to encode pixels).

P.S. I quickly installed and tried Istanbul, as I previously planned to do. It’s working for a few seconds of recording but then it stops. I didn’t have time to see what’s wrong (I think my resolution is too high: 1280×1024).

P.P.S. Oh, yes … And I offered to be, and became, the new contact person for the French team of LinuxFocus. I would like to thank Iznogood for the work done before me and I hope he will stay active in the free software community. I will upload my latest translations as soon as possible and try to put some pep into the French-speaking team.