20 Jul 2011

Internet Activist Aaron Swartz Indicted for Data Theft: Downloading Millions of Academic Articles



For a long time, it was the folks who downloaded music or movies illegally who faced the wrath of government prosecutors. So the unsealing of an indictment today against Aaron Swartz, former Reddit-er and founder of Demand Progress, for the illegal download of some 4 million-odd academic journal articles may sound a bit unusual.
Demand Progress has issued a statement suggesting Swartz's actions were akin to "checking too many books out of the library." But the government clearly disagrees: the charges include wire fraud, computer fraud, and unlawfully obtaining information from a protected computer. Swartz now faces up to 35 years in prison and up to $1 million in fines.


How He Did It

The indictment (a full copy is here) details Swartz's purchase of a laptop, which he used to "systematically access and rapidly download an extraordinary volume of articles from JSTOR." JSTOR is an online database of academic journals. It provides the full texts of digitized journals, with back issues for some of the most popular ones dating back hundreds of years. A non-profit organization, JSTOR offers its service primarily to academic libraries, which in turn make the content available to their patrons.
In a statement today, JSTOR says that last fall and winter it "experienced a significant misuse of our database. A substantial portion of our publisher partners' content was downloaded in an unauthorized fashion using the network at the Massachusetts Institute of Technology, one of our participating institutions. The content taken was systematically downloaded using an approach designed to avoid detection by our monitoring systems."
The indictment details how Swartz did just that, from the purchase of the laptop, to the creation of ghost accounts on the MIT network, to the break-in at a wiring closet where Swartz had stored his equipment.

Why He Did It

Why would Aaron Swartz want 4 million academic journal articles? Blogger Jason Kottke says "it's not too difficult to guess," and points to Swartz's earlier efforts to download and distribute files from Pacer, the government-run Public Access to Court Electronic Records system. When Pacer was opened to a limited number of libraries, Swartz, among others, the New York Times reported, tried to "download as many court documents as they could, and send them to him for republication on the Web, where Google could get to them."
It's not clear if this is what Swartz had in mind by copying the JSTOR database: "liberating," if you will, the journal articles for more open consumption. But in its statement, JSTOR says that it had already reached an agreement with Swartz and had "received confirmation that the content was not and would not be used, copied, transferred, or distributed."
Whatever the intention, the U.S. Attorney for the District of Massachusetts makes clear the government's position: "Stealing is stealing whether you use a computer command or a crowbar, and whether you take documents, data or dollars." Even though it appears as though JSTOR was not interested in pressing charges (it has declined to comment specifically about that), the government has leveled some serious felony charges against Swartz.
But rest assured, scholars everywhere: even though Swartz allegedly "stole" 4 million journal articles, they're all still available in JSTOR.
In court in Boston today, Swartz pleaded not guilty on all counts. His next court date is set for September 9.
