Category: Open Access

Open Access week: October 24-31, 2011

For once, I won’t write about a day here but about a week: this week is Open Access week (OA week). In this fourth edition, there is no need to explain once more what Open Access is (but if you still want to read about it, read the Wikipedia article or Peter Suber’s overview). This year, the week is defined as “an opportunity […] to continue to learn about the potential benefits of Open Access, to share what they’ve learned with colleagues, and to help inspire wider participation in helping to make Open Access a new norm in scholarship and research”.

I was curious about the state of Open Access in Belgian universities. On the OA week website, only two Belgian events were registered: one workshop centered around pending issues in the management of institutional repositories (organized by the national science funding body, the FNRS) and one stream of activities at the University of Liege (yippee, ULg is my Alma Mater!). The new thing (at least for me) is that both events are either captured on video or launched with a video. The launch video from the University of Liege includes interviews with researchers telling how Open Access helps them and others (it’s a pity the entry page for its library network is still the same as ten years ago).

Paul Thirion about the OA week at the University of Liege

If you look for the information, you’ll find that the University of Ghent is also participating in the Open Access week, with two professors describing Open Access in videos in Dutch and a website about it (in collaboration with the ULg): http://www.openaccess.be.

Other Belgian universities usually support Open Access without any specific action for this week (except ULB, with a recap mainly on the financial benefits). Is it a sign that Open Access is losing momentum, or has it simply become part of everyday life in universities?

An update on JoVE

Three years ago, I wrote about JoVE, the Journal of Visualized Experiments. JoVE was a peer reviewed, open access, online journal devoted to the publication of biological research in a video format. I recently discovered that since 2009, JoVE is now just a peer reviewed, online journal devoted to the publication of biological research in a video format: the “open access” part is gone. You can debate at length whether JoVE was Open Access (as I thought) or not. I just think it’s sad, although I understand their motives: in a recent exchange with them, they wrote they “handle most production of our content [themselves] and it is a very very costly operation”.

The recent exchange I had with JoVE was about another previous post describing a way to store the videos locally, as anyone would do with Open Access articles in PDF format. I was unaware of two things:

  1. JoVE dropped the “Open Access” wording as I wrote above (however, it is still possible to publish a video in free access for a higher fee, described as “Open access” in the About section for authors);
  2. the “trick” was still working (people at JoVE seemed to be aware of that, and I saw similar descriptions of the trick elsewhere).

Unfortunately, this trick will stop working in the coming weeks since they will “do token authentication with [their] CDN”. For me, JoVE will remain a very interesting journal with quality videos and without any equivalent yet (SciVee doesn’t play in the same playground and I wonder why Research Explainer missed the comparison in their 2010 interview).

I then wondered what impact this decision had on the number of videos published in JoVE as free access. I didn’t find any statistics related to this on the JoVE website (unrelated thought: I like the way BioMed Central gives access to its whole corpus). I therefore relied on PubMed to find all the indexed articles from JoVE and on its classification of “Free Full Text” (i.e. copied on the PubMed Central website, including the video). At the time of writing (August 2011), out of a total of 1191 indexed articles, 404 are “Free Full Text”. This is nearly 34% of all JoVE articles. When you split this by year since 2006 (when JoVE went online), you obtain the following table and chart:

Year   All articles   Free Full Text articles   Note
2006             18                        18   Full free access
2007            127                       127   Full free access
2008            115                        87
2009            217                       118   Introduction of Closed Access
2010            358                        42
2011            356                        12   So far (August 2011)
2011            534                        18   Extrapolation to full year keeping the same proportion
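For readers who want to redo the arithmetic behind the table, here is a minimal Python sketch. The counts are copied from the table above; the variable and function names are mine, and the 8 months used for the extrapolation is an assumption inferred from “So far (August 2011)”.

```python
# Counts copied from the table above: year -> (all articles, Free Full Text articles).
counts = {
    "2006": (18, 18),
    "2007": (127, 127),
    "2008": (115, 87),
    "2009": (217, 118),
    "2010": (358, 42),
    "2011 (to August)": (356, 12),
}


def free_share(total, free):
    """Percentage of articles flagged as Free Full Text."""
    return 100.0 * free / total


# Per-year share of Free Full Text articles (the quantity plotted in the right chart).
for year, (total, free) in counts.items():
    print(f"{year:<18}{total:>5}{free:>5}{free_share(total, free):>7.1f}%")

# Totals over all indexed articles.
all_total = sum(total for total, _ in counts.values())
all_free = sum(free for _, free in counts.values())
print(f"Overall: {all_free}/{all_total} = {free_share(all_total, all_free):.1f}%")

# Extrapolation to a full year, keeping the same proportion.
# Assumption: the 2011 counts cover 8 months (January to August).
months_elapsed = 8
projected_2011 = tuple(round(n * 12 / months_elapsed) for n in counts["2011 (to August)"])
print("Projected 2011 (all, free):", projected_2011)
```

Replacing the counts with fresh PubMed numbers is enough to update the figures later on.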

Total number of articles and free full texts in JoVE

As we can see on the left chart, plotting the total number of articles in JoVE -vs- time, there is a steady increase in the number of articles since 2006. This tends to suggest that more and more scientists enjoy publishing videos. It would be nice to have access to JoVE statistics in order to see if there is the same increase in the overall number of views of all videos. With “web 2.0” and broadband access in universities, I guess we would see this increase.

However, as we can see on the right chart, plotting the percentage of JoVE “Free Full Texts” in PubMed -vs- time, there is a dramatic decrease in the percentage of Free Full Texts in JoVE since 2008-2009. Fewer and fewer videos are published and available for free in PubMed Central. This is unfortunate for the reader without a subscription. This may also be unfortunate for the publisher since fewer and fewer authors pay the premium for free access over time. But since authors also pay for closed access, there is certainly a financial equilibrium.

Some methodological caveats … The PMC Free Full Texts are not necessarily in free access on the JoVE website, and vice versa (all the ones I checked are, but I didn’t check all of them!). This might explain why there is already a reduction in Free Full Texts in PMC in 2008 while JoVE closed their journal in April 2009. I also assumed the same proportion of free articles until the end of 2011 as in the beginning of 2011; this might not be the case (let’s see in January 2012; this also leads to the question: “is there a seasonal trend in publishing in JoVE?”).

What I take as an (obvious) message is that if authors can pay less for the same publication, they will, regardless of how accessible and affordable the publication will be for the reader. I don’t blame anyone. But I can’t help thinking the Open Access model is better for universal access to knowledge.

Photo credit: Sorry We’re Closed by Cinderella on Flickr (CC-by-nc-sa)

Aaron Swartz versus JSTOR

Aaron Swartz, a 24-year-old hacker, was recently indicted on data theft charges for downloading over 4 million documents from JSTOR, a US-based online system for archiving academic journals. Mainstream media (Reuters, Guardian, NYT, Time, …) reported this with a mix of facts and fiction. I guess that the recent attacks of hacking groups on well-known websites, and the release on the internet of the data they stole, gave this story some spice.

First, I really appreciate what Aaron Swartz did and is currently doing. From The Open Library, web.py and RSS to the Guerilla Open Access Manifesto and Demand Progress, he brought a lot to the computer world and the awareness of knowledge distribution.

Other blogs around the world are already talking about that and sometimes standing up for him. I especially liked The Economics of JSTOR (John Levin), The difference between Google and Aaron Swartz (Kevin Webb) and Careless language and poor analogies (Kevin Smith). I also encourage you to show your support for Aaron as I think he’s only the scapegoat for a bigger process …

I also think Aaron Swartz went too fast. If you do the maths (see appendix below), the download speed was approximately 49 MB per second. Even on a network as crowded as MIT’s, this continuous amount of traffic coming from a single computer (or a few if you forge your addresses) is easily spotted. I understand he might have been in a hurry given that his access was not fully legal (although I think it initially was). It was the best thing to do if he wanted to collect a maximum number of files in the shortest period of time.

This led me to wonder what the goal behind this act was.

People stated it was his second attempt at downloading large amounts of data (which is not exactly true), depicting him as a serial perpetrator. Others stated that his motives were purely academic (text-mining research, JSTOR Data For Research being somewhat limited). One can also think of an act similar to those of Anonymous or LulzSec that were in the press recently. Or money, maybe (4*10^6 articles at an average of $15 per article makes $60 million), although this seems highly unlikely. Or was it simply the application of his Guerilla Open Access Manifesto?

What is also puzzling me is the goal of JSTOR. It constantly repeats that it is supporting scholarly work and access to knowledge around the world. In its news statement, it says that the decision to prosecute Aaron Swartz was not its own but the US Attorney’s Office’s. But at the same time, it gives assurances that it secured “the content” and made sure it will not be distributed. And the indictment doesn’t contain anything related to intellectual property theft. The only portion related to the content is a fraudulent access to “things of value”.

I think one of the issues JSTOR has is that it doesn’t actually own the material it sells to scientists. The actual publishers are dictating what JSTOR can digitize and what it can’t. And unfortunately, they only see these papers as “things of monetary value”.

However, these things are actual scientific knowledge, usually from a distant past and usually no longer under copyright. Except for the cost of digitizing and building the search engine database (both of which Google Books and Google Scholar provide for free, as does the Gutenberg project in another area), all the costs related to the dissemination of these papers have already been covered, usually for a long time. The irony is that some of the papers behind the JSTOR paywall are sometimes even freely available elsewhere (e.g. in the repositories of institutions and societies).

It wouldn’t have cost much to put all these articles under an Open Access license while transferring them to JSTOR. JSTOR would then charge for the actual digitizing work but wouldn’t have to “secure the content” in case of redistribution since it would then be allowed. The not-for-profit service provided by JSTOR would then benefit knowledge instead of being one additional roadblock to it.

JSTOR, don’t become the RIAA or the MPAA of old scholarly content!

Appendix. The maths

In “retaliation”, Gregory Maxwell posted 32 GB of data containing 18,592 JSTOR articles on the internet. This is an average of 1.762 MB per JSTOR article. Aaron Swartz downloaded 4*10^6 articles from JSTOR, which represents approximately 6.723 TB of data. That took him 4 days (September 25th, 26th and October 8th and 9th, 2010) at an average of 1,721.17 GB per day. If we assume the computer was working 10 hours per day (he had to plug and unplug the computer during working hours), the average download speed is 172 GB per hour, or 2.869 GB per minute, or 48.958 MB per second.
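The same chain of divisions can be written as a short Python sketch, useful if you want to change one of the assumptions (for example the 10 hours per day). The variable names are mine; binary units (1 GB = 1024 MB, 1 TB = 1024 GB) are an assumption on my side, chosen because they reproduce the figures above.

```python
# Binary units are assumed here because they reproduce the figures of this appendix.
MB_PER_GB = 1024
GB_PER_TB = 1024

# Inputs taken from the text.
sample_size_gb = 32              # size of the archive posted by Gregory Maxwell
sample_articles = 18_592         # number of articles in that archive
downloaded_articles = 4_000_000  # articles downloaded from JSTOR
days = 4                         # September 25th, 26th and October 8th, 9th, 2010
hours_per_day = 10               # assumption made in the text

# Average article size, estimated from the posted archive.
mb_per_article = sample_size_gb * MB_PER_GB / sample_articles

# Estimated total volume of the download.
total_gb = downloaded_articles * mb_per_article / MB_PER_GB
total_tb = total_gb / GB_PER_TB

# Average throughput at different time scales.
gb_per_day = total_gb / days
gb_per_hour = gb_per_day / hours_per_day
gb_per_minute = gb_per_hour / 60
mb_per_second = gb_per_minute * MB_PER_GB / 60

print(f"{mb_per_article:.3f} MB per article")
print(f"{total_tb:.3f} TB in total")
print(f"{gb_per_day:.2f} GB per day")
print(f"{gb_per_hour:.1f} GB per hour")
print(f"{gb_per_minute:.3f} GB per minute")
print(f"{mb_per_second:.3f} MB per second")
```

With 24 hours per day instead of 10, the required speed drops to a bit more than 20 MB per second, which is still a lot for a single machine.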

Photo credit: Boston Wiki Meetup by Sage Ross on Flickr (CC-by-sa).

About file formats accepted by BioMed Central

BioMed Central is one of the main Open Access publishers in the world of Science, Technology and Medicine. On a side note, that’s where I published my two articles (in Proteome Science and the Journal of Circadian Rhythms). One might think that, given their support to Open Access, they would also support Open Source software and Open Format documents.

On the software side, it’s not very clear. Although they ask authors to consider releasing software described in publications under a free (or at least open source) license, they also support and advertise a bunch of proprietary software. While it’s not a bad thing per se (it enlarges the number of potential authors), it’s sad to see they don’t cite popular free software like OpenOffice.org (to write your article), Gimp (to edit your figures) or Zotero (for reference management). These are only three of the main programs, one in each category, but the free software world has many more of them!

I decided to write this post because I recently received an e-mail from BioMed Central stating that BMC Bioinformatics, one of their flagship publications, accepts a variety of different file formats in the submission process. This was already true when I submitted my articles. I wanted to know how they improved their submission process in this respect and if they now added open document formats (in a broad sense, not only the OpenDocument format somehow linked with OpenOffice.org).

E-mail from BMC Bioinformatics with file formats accepted for submission

My first comment is that the list of accepted file formats usually applies to all BioMed Central journals, not just BMC Bioinformatics, since they share the same publication platform. In the Instructions for Authors, the following file formats are accepted: Word, RTF and LaTeX (with the BMC template) for text; EPS, PDF, TIFF as well as PNG, Word (sic), PowerPoint (re-sic), JPEG and BMP for figures. In addition, they list CDX and TGF to represent chemical molecules. How disappointed I am!

I’m disappointed because some interesting open formats have been left out. And I can’t find interesting links stating that BioMed Central will support them soon.

With some stating that OpenOffice secured more than 15% of the business office suite market as of 2004 and despite an ISO standardisation (ISO/IEC 26300:2006), the OpenDocument formats are still absent. Many young scientists now use OpenOffice.org because it’s free (mainly free as in free beer, though), because labs can’t afford MS-Office license prices, even educational ones, but also because it allows them to do everything they want. I agree that you can easily convert your ODT, ODS or ODP documents into their respective proprietary DOC, XLS and PPT formats. But it would have been nice to be able to submit the ODx documents directly. On the technical side, ODx documents are “just” zipped XML files: tools exist to automatically parse them and transform them into the journal’s final format (I didn’t write it’s easy but it should be easier than reverse-engineering closed, proprietary file formats).
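To give an idea of what “just XML” means in practice, here is a minimal Python sketch using only the standard library. An OpenDocument text file is a ZIP archive whose main content lives in a member called content.xml; the function below lists the paragraphs it finds there. The function names are mine, and the demonstration file built in memory is a deliberately stripped-down stand-in (an assumption for illustration), not a complete, valid ODT as a word processor would save it.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces defined by the OpenDocument standard.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def odt_paragraphs(odt_file):
    """Return the text of every <text:p> paragraph found in content.xml."""
    with zipfile.ZipFile(odt_file) as archive:
        root = ET.fromstring(archive.read("content.xml"))
    # itertext() also collects text nested in child elements such as <text:span>.
    return ["".join(p.itertext()) for p in root.iter(f"{{{TEXT_NS}}}p")]


def make_demo_odt():
    """Build a tiny ODT-like archive in memory (hypothetical, for demonstration only)."""
    content = (
        f'<office:document-content xmlns:office="{OFFICE_NS}" xmlns:text="{TEXT_NS}">'
        "<office:body><office:text>"
        "<text:p>Background</text:p>"
        "<text:p>Methods and <text:span>results</text:span></text:p>"
        "</office:text></office:body></office:document-content>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        archive.writestr("content.xml", content)
    buffer.seek(0)
    return buffer


paragraphs = odt_paragraphs(make_demo_odt())
print(paragraphs)
```

The same function accepts a path to a real .odt file; a real conversion to a journal layout would of course also have to deal with styles, tables, figures and references.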

I’m also disappointed because although the PowerPoint format is there, SVG is not. I guess it’s just because they only use bitmap versions of the PowerPoint files. All vector graphic editors supporting SVG (and all of them support SVG: Adobe Illustrator, Inkscape, Dia, …) can export a bitmap equivalent of your drawings. So it may have little impact. But it would have been better if BMC supported SVG directly.

In conclusion, I’m hoping the extraordinary work done by BioMed Central in the publication area will extend to the formats they accept for submission. A partial example could come from PLoS submission guidelines (here for PLoS Computational Biology, especially for figures) where they explain a lot of technical as well as license aspects and cite free software as reference.

JoVE and (self-)archiving?

In my previous post, I was glad to see that the Journal of Visualized Experiments (JoVE) was now indexed by PubMed. I then spent some time watching some very interesting videos. And I realized that something is missing …

I thought that third-party archiving (like arXiv or self-archiving) was one of the mandatory requirements for Open Access journals … and I was wrong. It seems JoVE does not offer the (technical) possibility to download the publication from their website (all you can download is the abstract as text). Since this publication is a video and not a text/PDF version, it’s a problem for me (who cares?) and for the Open Access movement (imho).

“Classical” Open Access journals are “just” an evolution of traditional, Closed Access journals (or rather a return to the original transmission form of scientific papers): usually, you can read the paper on the journal website but you can also download it and print it if you want (for offline reading or if you still prefer articles on paper). The problem with videos is that you can’t print them. Is it a sufficient reason to forbid the download of these videos?

Fortunately, there is a technical trick that lets you download the video (it will still be in Flash 9 format but this problem is currently out of our scope). Once you are on the page of the video you are interested in (example), view its source code (Ctrl+U in Firefox) and look for the string “xml_file_name”. Copy the value of this variable, stopping before the first “%26” you encounter; for our example, we’ll copy this: “http://www.jove.com/projects/VideoChapterXML/default.aspx?VideoID=211”. Enter this in your address bar and you’ll get another (XML) file (hence the name). On its first line, you’ll find the URL of your video in Flash format (flv); in our example: “http://source.jove.com/164.flv”.
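The two manual steps above can be sketched in a few lines of Python working on text you saved yourself (no network access here). Only the string “xml_file_name”, the “%26” stop and the two example URLs come from the description above; the function names and the exact layout of the two sample snippets are assumptions, since the real page source and XML file may look different.

```python
import re


def chapter_xml_url(page_source):
    """Return the value following 'xml_file_name=', cut before the first '%26'.

    Assumption: the variable appears in the page as 'xml_file_name=<url>%26...'.
    """
    match = re.search(r"xml_file_name=(.+?)(?=%26|[\"'&\s]|$)", page_source)
    return match.group(1) if match else None


def first_flv_url(xml_text):
    """Return the first URL ending in .flv found in the XML text, if any."""
    match = re.search(r"https?://[^\s\"'<>]+\.flv", xml_text)
    return match.group(0) if match else None


# Hypothetical snippets mimicking the page source and the XML file of the example.
sample_page = (
    '<param name="flashvars" value="xml_file_name=http://www.jove.com/projects/'
    'VideoChapterXML/default.aspx?VideoID=211%26autoplay=false" />'
)
sample_xml = '<chapters video="http://source.jove.com/164.flv"></chapters>'

xml_url = chapter_xml_url(sample_page)
flv_url = first_flv_url(sample_xml)
print(xml_url)
print(flv_url)
```

Both functions return None when nothing matches, which is what you should expect if the page layout changes.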

I wonder whether, in the future, JoVE will include a link to download its videos or whether it will obfuscate its source code in order to prevent further downloads.

JoVE on PubMed

JoVE, the Journal of Visualized Experiments, is a peer reviewed, open access, online journal devoted to the publication of biological research in a video format. Think of a YouTube-like service for the life-science community, add a quality control before publication and you’ll get the picture. Like many other Open Access scientific journals, JoVE is now indexed in PubMed, the life-science publications directory. It’s nice to see interesting, open and innovative initiatives getting a “recognition” like this.

Thanks to Biosingularity for the info.

AEL-NG?

A few days ago, I was sad to see that the Association Electronique Libre (AEL) website was down and only replaced by two measly <html> tags. For those who didn’t know it:

The Association Electronique Libre is a belgian association protecting the fundamental rights in the information society.

The Association Electronique Libre supports the freedoms of speech, press, and association on the Internet and any electronical mediums, the right to use encryption software for private communication, the right to write software unimpeded by private monopolies, the right to access and preserve public domain and free digital information.
(from an old copy of the AEL website)

Although it was based in Belgium, the information it contained as well as the actions that were supported reached beyond the small Belgian borders. The wiki was a very useful and valuable source of documents, links and comments about freedom in the electronic media. “Fortunately” we still have a 2007 version of the website on archive.org and some messages from the mailing-list were kept on the mail-archive and open subscriber (and I will preciously keep my archives!).

From a short exchange of e-mails with one of the main people behind the AEL, I learned that the machine hosting the AEL is simply dead (the fact that the machine was dying was announced a long time ago, but apparently no one reacted). I guess (or rather hope) that the data is still available on the hard disk(s).

Now what? Besides the fact we are all getting “older” with other priorities in life, how come we don’t feel more concerned about our freedom in cyberspace? Internet liberties are still in danger [1], the Electronic Frontier Foundation website has more and more issues, a paper-media publishing house is printing comics to “educate teenage youth about an array of issues ranging from privacy, free software, security and the impact of politics on personal freedom as it relates to the use of technology”, … Are we too lazy to try to understand what’s behind Facebook, LinkedIn, Orkut, Ning and other “social networking websites”? Maybe the technological gap between these polished websites and what individuals can do “in their garage” radically increased since the advent of so-called Web2.0, inhibiting our will to actively participate in it [2], to make it ours? Did most of us “surrender” in front of the razzle-dazzle aspects of new communication media?

The idea behind this post title (AEL – New Generation?) is simply that something should be done to bring back to life a central, hopefully community-driven website to gather information about our freedom in cyberspace …

[1] Ironically, in this post, this reference is written by the main person behind the AEL
[2] About the “creativity” of people in Web2.0 applications, we could read with interest this article from C. Jonckheere and F. Schreuer (unfortunately in French only)

One more Open Source software at ULg

After the promotion of Open Access (see Bernard Rentier’s blog) and a history of publications in Open Access journals (see this latest article from the Cyclotron Research Center in PLoS), the University of Liege is slowly publishing Open Source software too.

The latest free software published is exams, an assessment management system (for on-line exams, …). They chose the GNU GPL 2, apparently without the possibility to upgrade to version 3 (I don’t know if it’s deliberate or not). And you can download the source code here.

What is even more interesting is that they provide a demonstration website if you want to test it in a nearly real setup (as examiners or students; only in French). And the demonstration system is hosted by a commercial hosting company (OVH), indicating that it should be possible to use this system on very common platforms (only PHP/MySQL are required).

Now, we can dream of other software from the ULg released as free software, a Subversion repository and a users/developers community around exams.

P.S.: of course, we already did all that 😉 since we published Gemvid in an Open Access journal (the Journal of Circadian Rhythms) and published it along with a lot of other tools as free software. But I don’t count this as an institutional push towards free software since it was mainly my decision and the development didn’t involve other people.

Microsoft Research to sponsor Open Access awards

In a somewhat strange move, Microsoft Research is going to sponsor the BioMed Central 2007 Research Awards.

Lee Dirks, director, scholarly communications, Microsoft Research: “We are very supportive of the open science movement and recognize that open access publication is an important component of overall scholarly communications.”

I hope the other Microsoft divisions are going to follow this move and sponsor (or release their products as) Open Source and free software projects … More details on the announcement here.