Neal Rauhauser: Global Database of Events, Language & Tone (GDELT) Is SAFE!

Global Database of Events, Language & Tone (GDELT) Is SAFE!
by Neal Rauhauser

A long, uncomfortable silence followed my post GDELT’s Mysterious Demise, but we now have the particulars on what happened:

The bottom line is that GDELT is one of the very few event datasets in existence today that actually has all of the necessary permissions.

The concerns that have recently been discussed were raised by two faculty members at the University of Illinois and were examined by a panel of faculty experts convened by the University of Illinois’ Vice Chancellor for Research. That panel formally cleared GDELT on behalf of that office stating “the Panel finds that it was not able to conclude that GDELT is founded on misappropriated … data or software.” With respect to concerns raised regarding the open source TABARI software that GDELT makes use of to create its CAMEO event records, the same panel explored concerns raised regarding its ownership and similarly found that “TABARI … has well known antecedents at another institution dating back to at least 2000 and therefore is not attributable to the [University of Illinois]”. While this whole situation would have been easily avoided with just a little communication and avoided a lot of unnecessary angst, the silver lining is that it has demonstrated just how widely-used and important GDELT has really become over the past year and we are tremendously excited to work with all of you in 2014 to really explore the future of “big data” study of human society.

I had suspected a problem with either the underlying data or the software used. It turns out that both issues were raised by the University of Illinois professors who parted ways with the project.

This feels a bit like the USL vs. BSDi lawsuit, which freed Unix from AT&T’s clutches twenty years ago. A big, important data source is now out in the open in such a way that it cannot be put back. I have some financial records digging to do in the coming week, and the Montgomery County Council will remain a priority until the primary is over, but I am itching to wrestle the GDELT feed into some format I can personally use.