Tag: Open Access

Open Access week: October 24-31, 2011

For once, I won’t write about a day here but about a week: this week is Open Access week (OA week). For this fourth edition, there is no need to explain once more what Open Access is (but if you still want to read about it, see the Wikipedia article or Peter Suber’s overview). This year, the week is defined as “an opportunity […] to continue to learn about the potential benefits of Open Access, to share what they’ve learned with colleagues, and to help inspire wider participation in helping to make Open Access a new norm in scholarship and research”.

I was curious about the state of Open Access in Belgian universities. On the OA week website, only two Belgian events were registered: one workshop centered around pending issues in the management of institutional repositories (organized by the national science funding body, the FNRS) and one stream of activities at the University of Liege (yippee, ULg is my Alma Mater!). The new thing (at least for me) is that both events are either captured on video or launched with a video. The launch video from the University of Liege includes interviews with researchers explaining how Open Access helps them and others (it’s a pity the entry page for its library network is still the same as ten years ago).

Paul Thirion about the OA week at the University of Liege

If you look for the information, you’ll find that the University of Ghent is also participating in the Open Access week, with two professors describing Open Access in videos in Dutch and a website about it (in collaboration with the ULg): http://www.openaccess.be.

Other Belgian universities usually support Open Access without any specific action for this week (except ULB, with a recap mainly on the financial benefits). Is it a sign that Open Access is losing momentum, or has it just become part of everyday life in universities?

An update on JoVE

Sorry We're Closed by bluecinderella on Flickr

Three years ago, I wrote about JoVE, the Journal of Visualized Experiments. JoVE was a peer reviewed, open access, online journal devoted to the publication of biological research in a video format. I recently discovered that since 2009, JoVE has been just a peer reviewed, online journal devoted to the publication of biological research in a video format: the “open access” part is gone. You can debate at length whether JoVE ever was Open Access (as I thought) or not. I just think it’s sad, although I understand their motives: in a recent exchange with them, they wrote they “handle most production of our content [themselves] and it is a very very costly operation”.

The recent exchange I had with JoVE was about an earlier post describing a way to store the videos locally, as anyone would do with Open Access articles in PDF format. I was unaware of two things:

  1. JoVE dropped the “Open Access” wording, as I wrote above (however, it is still possible to publish a video in free access for a higher fee, described as “Open access” in the About section for authors);
  2. the “trick” was still working (people at JoVE seemed to be aware of that, and I saw similar descriptions of the trick elsewhere).

Unfortunately, this trick will stop working in the coming weeks since they will “do token authentication with [their] CDN”. For me, JoVE will remain a very interesting journal with quality videos and no equivalent yet (SciVee doesn’t play in the same playground and I wonder why Research Explainer missed the comparison in their 2010 interview).

I then wondered what impact this decision could have had on the number of videos published in JoVE as free access. I didn’t find any statistics related to this on the JoVE website (unrelated thought: I like the way BioMed Central gives access to its whole corpus). I therefore relied on PubMed to find all the indexed articles from JoVE, and on its classification of “Free Full Text” (i.e. copied on the PubMed Central website, including the video). At the time of writing (August 2011), out of a total of 1191 indexed articles, 404 are “Free Full Text”. This is nearly 34% of all JoVE articles. When you split this by year since 2006 (when JoVE went online), you obtain the following table and chart:

Year | All articles | Free Full Text articles | Note
2006 |           18 |                      18 | Full free access
2007 |          127 |                     127 | Full free access
2008 |          115 |                      87 |
2009 |          217 |                     118 | Introduction of Closed Access
2010 |          358 |                      42 |
2011 |          356 |                      12 | So far (August 2011)
2011 |          534 |                      18 | Extrapolation to full year keeping the same proportion
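For readers who want to check the percentages and the 2011 extrapolation, here is a minimal Python sketch. The counts are copied from the table; the function names are mine, and the extrapolation assumes (as I did) that the first eight months of 2011 are representative of the full year.

```python
# Yearly counts copied from the table: year -> (all articles, Free Full Text articles).
COUNTS = {
    2006: (18, 18),
    2007: (127, 127),
    2008: (115, 87),
    2009: (217, 118),
    2010: (358, 42),
    2011: (356, 12),  # January to August 2011 only
}


def free_share(all_articles, free_articles):
    """Return the percentage of articles flagged as Free Full Text."""
    return 100.0 * free_articles / all_articles


def extrapolate(count, months_observed=8, months_total=12):
    """Scale a partial-year count linearly to a full year (assumes no seasonal trend)."""
    return round(count * months_total / months_observed)


def summary(counts=COUNTS):
    """Return (total articles, total Free Full Text articles, overall percentage)."""
    total = sum(all_articles for all_articles, _ in counts.values())
    free = sum(free_articles for _, free_articles in counts.values())
    return total, free, free_share(total, free)


if __name__ == "__main__":
    for year, (all_articles, free_articles) in sorted(COUNTS.items()):
        print(f"{year}: {free_share(all_articles, free_articles):5.1f}% Free Full Text")
    print("Overall:", summary())
    print("2011 extrapolated:", extrapolate(356), "articles,", extrapolate(12), "free")
```

The linear scaling is the weakest part: if JoVE publishes more at the end of the year than at the beginning, the full-year figures will be off.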

Total number of articles and free full texts in JoVE

As we can see on the left chart, which plots the total number of articles in JoVE over time, there is a clear overall increase in the number of articles since 2006 (despite a small dip in 2008). This tends to show that more and more scientists enjoy publishing videos. It would be nice to have access to JoVE statistics in order to see if there is the same increase in the overall number of views of all videos. With “web 2.0” and broadband access in universities, I guess we would see this increase.

However, as we can see on the right chart, which plots the percentage of JoVE “Free Full Texts” in PubMed over time, there is a dramatic decrease in the percentage of Free Full Texts in JoVE since 2008-2009. Fewer and fewer videos are published and available for free in PubMed Central. This is unfortunate for the reader without a subscription. This may also be unfortunate for the publisher, since fewer and fewer authors pay the premium for free access over time. But since authors also pay for closed access, there is certainly a financial equilibrium.

Some methodological caveats … The PMC Free Full Texts are not necessarily in free access on the JoVE website (and vice versa; all the ones I checked are, but I didn’t check all of them!). This might explain why there is already a reduction in Free Full Texts in PMC in 2008 while JoVE closed their journal in April 2009. I assumed the same proportion of free articles would be published until the end of 2011 as in the beginning of 2011; this might not be the case (let’s see in January 2012; this also leads to the question: “is there a seasonal trend in publishing in JoVE?”).
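If you want to redo the counts yourself, one possible way is to query PubMed through the NCBI E-utilities `esearch` endpoint. The sketch below is only an illustration under several assumptions of mine: that "J Vis Exp" is the journal abbreviation PubMed uses for JoVE, that the `free full text[sb]` filter matches what the PubMed interface labels “Free Full Text”, and that filtering on publication date (`[dp]`) is the right way to split by year. The helper names are hypothetical, and counts obtained today will differ from the ones in the table.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities search endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_term(year, free_only=False):
    """Build a PubMed search term for JoVE articles published in a given year."""
    # Assumption: "J Vis Exp" is the PubMed journal abbreviation for JoVE.
    term = f'"J Vis Exp"[Journal] AND {year}[dp]'
    if free_only:
        # Assumption: this subset filter corresponds to "Free Full Text".
        term += " AND free full text[sb]"
    return term


def build_url(year, free_only=False):
    """Return the esearch URL asking only for the number of matching records."""
    params = {
        "db": "pubmed",
        "term": build_term(year, free_only),
        "rettype": "count",
        "retmode": "json",
    }
    return ESEARCH + "?" + urlencode(params)


def fetch_count(year, free_only=False):
    """Query PubMed (network access needed) and return the record count."""
    with urlopen(build_url(year, free_only)) as response:
        payload = json.load(response)
    return int(payload["esearchresult"]["count"])
```

Calling `fetch_count(2010)` and `fetch_count(2010, free_only=True)` would give the two numbers needed for one row of the table; please be gentle with the NCBI servers if you loop over the years.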

What I take as an (obvious) message is that if authors can pay less for the same publication, they will, regardless of how accessible and affordable the publication will be for the reader. I don’t blame anyone. But I can’t help thinking the Open Access model is better for universal access to knowledge.

Photo credit: Sorry We’re Closed by Cinderella on Flickr (CC-by-nc-sa)

Aaron Swartz versus JSTOR

Boston Wiki Meetup

Aaron Swartz, a 24-year-old hacker, was recently indicted on data theft charges for downloading over 4 million documents from JSTOR, a US-based online system for archiving academic journals. Mainstream media (Reuters, Guardian, NYT, Time, …) reported this with a mix of facts and fiction. I guess that the recent attacks of hacking groups on well-known websites, and the release on the internet of the data they stole, gave this story some spice.

First, I really appreciate what Aaron Swartz did and is currently doing. From The Open Library, web.py and RSS to the Guerilla Open Access Manifesto and Demand Progress, he has brought a lot to the computer world and to the awareness of knowledge distribution.

Other blogs around the world are already talking about that and sometimes standing up for him. I especially liked The Economics of JSTOR (John Levin), The difference between Google and Aaron Swartz (Kevin Webb) and Careless language and poor analogies (Kevin Smith). I also encourage you to show your support for Aaron as I think he’s only the scapegoat for a bigger process …

I also think Aaron Swartz went too fast. If you do the maths (see appendix below), the download speed was approximately 49 MB per second. Even on a crowded network like MIT’s, this continuous amount of traffic coming from a single computer (or a few, if you forge your addresses) is easily spotted. I understand he might have been in a hurry given that his access was not fully legal (although I think it initially was). It was the best thing to do if he wanted to collect a maximum number of files in the shortest period of time.

This led me to wonder what the goal behind this act was.

People stated it was his second attempt at downloading large amounts of data (which is not exactly true), depicting him as a serial perpetrator. Others stated that his motives were purely academic (text-mining research, JSTOR Data For Research being somewhat limited). One can also think of an act similar to those of Anonymous or LulzSec, which were in the press recently. Or money, maybe (4*10^6 articles at an average of $15 per article makes $60 million), although this seems highly unlikely. Or simply the application of his Guerilla Open Access Manifesto?

What also puzzles me is the goal of JSTOR. It constantly repeats that it supports scholarly work and access to knowledge around the world. In its news statement, it says that the decision to prosecute Aaron Swartz was not its own but the US Attorney’s Office’s. But at the same time, it assures that it secured “the content” and made sure it will not be distributed. And the indictment doesn’t contain anything related to intellectual property theft. The only portion related to the content concerns fraudulent access to “things of value”.

I think one of the issues JSTOR has is that it doesn’t actually own the material it sells to scientists. The actual publishers dictate what JSTOR can digitize and what it can’t. And unfortunately, they only see these papers as “things of monetary value”.

However, these things are actual scientific knowledge, usually from a distant past and usually no longer under copyright. Apart from the cost of digitizing and building the search engine database (both of which Google Books and Google Scholar provide for free, as does Project Gutenberg in another area), all the costs related to the dissemination of these papers have already been covered, usually for a long time. The irony is that some of the papers behind the JSTOR paywall are even freely available elsewhere (e.g. in institutions’ and societies’ repositories).

It wouldn’t have cost much to put all these articles under an Open Access license while transferring them to JSTOR. JSTOR would then charge for the actual digitizing work but wouldn’t have to “secure the content” against redistribution, since redistribution would then be allowed. The not-for-profit service provided by JSTOR would then benefit knowledge instead of being one additional roadblock to it.

JSTOR, don’t become the RIAA or the MPAA of old scholarly content!

Appendix. The maths

In “retaliation”, Gregory Maxwell posted 32 GB of data containing 18,592 JSTOR articles on the internet. This is an average of 1.762 MB per JSTOR article. Aaron Swartz downloaded 4*10^6 articles from JSTOR, which represents approximately 6.723 TB of data. That took him 4 days (September 25th and 26th, and October 8th and 9th, 2010), at an average of 1,721.17 GB per day. If we assume the computer was working 10 hours per day (he had to plug and unplug the computer during working hours), the average download speed is 172 GB per hour, or 2.869 GB per minute, or 48.958 MB per second.
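The same back-of-the-envelope calculation can be written as a short Python sketch. It rests on the assumptions stated above (the average article size in Maxwell’s archive is representative of the whole download, and the computer ran 10 hours on each of the 4 days) and uses 1 GB = 1024 MB and 1 TB = 1024 GB; the function name is mine.

```python
ARCHIVE_GB = 32                  # size of the archive posted by Gregory Maxwell
ARCHIVE_ARTICLES = 18_592        # number of articles in that archive
DOWNLOADED_ARTICLES = 4_000_000  # articles downloaded from JSTOR
DAYS = 4                         # days on which the download ran
HOURS_PER_DAY = 10               # assumption: 10 hours of activity per day


def download_estimate():
    """Estimate total volume and average speed of the JSTOR download."""
    mb_per_article = ARCHIVE_GB * 1024 / ARCHIVE_ARTICLES
    total_gb = mb_per_article * DOWNLOADED_ARTICLES / 1024
    gb_per_day = total_gb / DAYS
    gb_per_hour = gb_per_day / HOURS_PER_DAY
    return {
        "mb_per_article": mb_per_article,
        "total_tb": total_gb / 1024,
        "gb_per_day": gb_per_day,
        "gb_per_hour": gb_per_hour,
        "gb_per_minute": gb_per_hour / 60,
        "mb_per_second": gb_per_hour * 1024 / 3600,
    }


if __name__ == "__main__":
    for name, value in download_estimate().items():
        print(f"{name}: {value:,.3f}")
```

Changing `HOURS_PER_DAY` to 24 shows how much the result depends on that guess: the average speed drops to a bit more than 20 MB per second.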

Photo credit: Boston Wiki Meetup by Sage Ross on Flickr (CC-by-sa).