Stephen E. Arnold: Improving Government Data Access – Avoid PDF, Provide CSV, Simplify Search

Stephen E. Arnold

Steps Offered to Improve Government Data Sites

The FlowingData article “How to Make Government Data Sites Better” uses the Centers for Disease Control and Prevention (CDC) website to illustrate steps the government should take to make its data more accessible and manageable. The first suggestion is to provide files in a usable format: by avoiding PDFs and providing CSV files (or even raw data), a site puts users in a much better position to work with the data. Another suggestion is to drop or simplify the multipart form that makes search nearly impossible. The author also calls for clearer and more consistent annotation, using the following scenario to illustrate the point:

“The CDC data subdomain makes use of the Socrata Open Data API,… It’s weekly data that has been updated regularly for the past few months. There’s an RSS feed. There’s an API. There’s a lot to like… There’s also a lot of variables without much annotation or metadata … When you share data, tell people where the data is from, the methodology behind it, and how we should interpret it. At the very least, include a link to a report in the vicinity of the dataset.”
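To make the CSV point concrete, here is a minimal Python sketch of how little work a CSV file demands compared with a PDF. The sample data, the column names (week, region, cases), and the helper total_by_region are all invented for illustration; none of them come from the CDC site or the FlowingData article.

```python
import csv
import io

# Hypothetical sample standing in for a published CSV file. The column
# names and values are invented for illustration; this is not CDC data.
SAMPLE_CSV = """week,region,cases
2014-01,Northeast,120
2014-01,South,95
2014-02,Northeast,130
2014-02,South,101
"""


def total_by_region(csv_text):
    """Sum the 'cases' column for each region found in the CSV text."""
    totals = {}
    # DictReader uses the header row to label each field, so no manual
    # position counting or text scraping is needed.
    for row in csv.DictReader(io.StringIO(csv_text)):
        region = row["region"]
        totals[region] = totals.get(region, 0) + int(row["cases"])
    return totals


if __name__ == "__main__":
    print(total_by_region(SAMPLE_CSV))
```

Getting the same table out of a PDF would usually mean a separate extraction step before any analysis could begin, which is the friction the author is objecting to.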

Overall, the author makes many salient points about transparency, consistency, and clutter. But the article assumes that the government actually wants to make its data sites better, and whether it does may be the larger question. If no one implements these ideas, perhaps that will be answer enough.

Chelsea Kerwin, July 08, 2014

