Stephen E. Arnold: Data Mining Cell Phones — and Public Data

IO Impotency
Stephen E. Arnold

It Is About Time We Start Data Mining Mobile Phones

Posted: 23 May 2013 06:59 AM PDT

Mobile phones are one of the main areas where companies are failing to collect data. Technology Review covers the topic in the article “Released: A Trove Of Cell Phone Data-Mining Research.” Cell phone data offers a wealth of opportunities that are only starting to be used to their full potential. Developing countries stand to benefit as much as developed ones. The article notes that cell phone data could be used to redesign transportation networks and could even yield eye-opening findings in epidemiology.

There is a worldwide effort to understand the ramifications of cell phone data:

“Ahead of a conference on the topic that starts Wednesday at MIT, a mother lode of research has been made public about how to use this data. For the past year, researchers around the world responded to a challenge dubbed Data for Development, in which the telecom giant Orange released 2.5 billion records from five million cell-phone users in Ivory Coast. A compendium of this work is the D4D book, holding all 850 pages of the submissions. The larger conference, called NetMob (now in its third year), also features papers based on cell phone data from other regions, described in this book of abstracts.”

Before you get too excited, note that privacy remains an important issue. No one has found a reasonable way to dissociate users from their cell phone data. It will only be a matter of time before that happens; until then, we can revel in the possibilities.

Whitney Grace, May 28, 2013

Sponsored by ArnoldIT.com, developer of Beyond Search

Harnessing The Power Of Raw Public Data

Posted: 27 May 2013 07:52 AM PDT

The Internet allows multiple data streams to converge and deliver their data to end users, but very few people know how to use public data, much less how to find it. TechCrunch reports on a solution in the article “Enigma Makes Unearthing And Sifting Through Public Data A Breeze.” Enigma is a New York startup whose team includes Hicham Oudghiri, Marc Dacosta, and CEO Jeremy Bronfmann. The company’s software pulls data from more than 100,000 public data sources and pools it in easy-to-read tables.

“That’s all very neat, but how does Enigma do it? The data itself comes from a host of places, but most of Enigma’s government data was obtained by issuing a Freedom of Information Act request to the U.S. General Services Administration for all the top level .gov domains. From there the team uses crawlers to download all the databases it can find, and algorithmically finds connections between all those data points to create a sort of public knowledge graph. Whenever you search for a term on Enigma, Enigma actually searches around that term to figure out and display whatever applicable data sets it can find.”
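The idea in the quote, linking data sets that share values and then searching "around" a term, can be illustrated with a small Python sketch. This is not Enigma's method, which the article does not detail; the data sets, field names, and the functions `build_index`, `build_graph`, and `search` are all hypothetical, and exact matching on normalized strings stands in for whatever the company actually does.

```python
from collections import defaultdict

# Hypothetical toy tables standing in for crawled public data sets.
DATASETS = {
    "lobbying_filings": [
        {"company": "Acme Corp", "amount": 120000},
        {"company": "Globex", "amount": 45000},
    ],
    "federal_contracts": [
        {"vendor": "Acme Corp", "agency": "GSA"},
        {"vendor": "Initech", "agency": "DOE"},
    ],
    "import_records": [
        {"importer": "Globex", "port": "Newark"},
    ],
}


def build_index(datasets):
    """Map each normalized string value to the data sets that contain it."""
    index = defaultdict(set)
    for name, rows in datasets.items():
        for row in rows:
            for value in row.values():
                if isinstance(value, str):
                    index[value.strip().lower()].add(name)
    return index


def build_graph(index):
    """Link two data sets whenever they share at least one value."""
    graph = defaultdict(set)
    for names in index.values():
        for a in names:
            for b in names:
                if a != b:
                    graph[a].add(b)
    return graph


def search(term, index, graph):
    """Return data sets that mention the term, then their direct neighbors."""
    direct = set(index.get(term.strip().lower(), set()))
    related = set()
    for name in direct:
        related |= graph.get(name, set())
    return sorted(direct), sorted(related - direct)


index = build_index(DATASETS)
graph = build_graph(index)
print(search("Acme Corp", index, graph))
# (['federal_contracts', 'lobbying_filings'], ['import_records'])
```

In this toy version, a search for "Acme Corp" returns the two tables that mention the name directly and also surfaces the import records, because that table shares a different value ("Globex") with the lobbying filings. A real system would need fuzzy matching and far more careful entity resolution.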

Enigma should be seen more as a search infrastructure solution, and the company’s heads believe it could become an integral part of the Internet in five years. As a tool, it has many benefits for researchers, and it has already formed partnerships with the New York Times, Capital IQ, S&P Capital, Gerson Lehrman Group, and the Harvard Business School. The startup offers only an enterprise product at the moment, but a free version is possible in the future. Although Enigma pulls all of its data from public sources, it must comply with the laws and regulations attached to that information. Enigma wants to play by the rules, and by playing within those bounds it hopes to become an indispensable tool.

Whitney Grace, May 28, 2013
