Atlantic, 16 December 2011
A new program can find and compare relationships in complicated data without being given specific queries
Are there subtle patterns lurking in data that can foretell a coming financial-system crash? What can explain the variations in sports-star salaries? How about the complex relationship between genes and certain diseases? Scientists in various fields have been searching for better ways to analyze large piles of data for such patterns, but the difficulty has always been that they need to know what they're looking for in order to find it. A new software program, described in the latest issue of Science, is designed to find the patterns in data that scientists don't know to look for.
David Reshef, one of the scientists behind MINE, as the program is called, explains, “Standard methods will see one pattern as signal and others as noise. There can potentially be a variety of different types of relationships in a given data set. What's exciting about our method is that it looks for any type of clear structure within the data, attempting to find all of them. … This ability to search for patterns in an equitable way offers tremendous exploratory potential in terms of searching for patterns without having to know ahead of time what to search for.” MINE compares different possible relationships (including linear, exponential, and periodic) and returns those that are strongest.
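To make the idea of comparing candidate relationships concrete, here is a minimal Python sketch. It is not MINE's actual algorithm, which this article does not describe. As a stand-in, it scores each pair of variables by how well a straight line, an exponential curve, and a sine wave fit (using R², the fraction of variance explained) and reports the strongest. The function names, the R² scoring, and the list of trial periods are all assumptions made for illustration.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Signed Pearson correlation; 0.0 if either sequence is (nearly) constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    # Absolute tolerance: an assumption that suits data of ordinary scale.
    if sxx < 1e-12 or syy < 1e-12:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def linear_score(xs, ys):
    """R^2 of a straight-line fit y = a + b*x."""
    return pearson(xs, ys) ** 2


def exponential_score(xs, ys):
    """R^2 of a line fitted to log(y), i.e. y = a*exp(b*x); requires y > 0."""
    if any(y <= 0 for y in ys):
        return 0.0
    return pearson(xs, [math.log(y) for y in ys]) ** 2


def periodic_score(xs, ys, periods=(1.0, 2.0, 5.0, 10.0)):
    """Best R^2 of y = a + b*sin(2*pi*x/p) + c*cos(2*pi*x/p) over trial periods.

    The trial periods are hypothetical; real data would need its own choices.
    """
    best = 0.0
    for p in periods:
        s = [math.sin(2 * math.pi * x / p) for x in xs]
        c = [math.cos(2 * math.pi * x / p) for x in xs]
        r_ys, r_yc, r_sc = pearson(s, ys), pearson(c, ys), pearson(s, c)
        denom = 1 - r_sc ** 2
        if denom < 1e-9:
            # sin and cos columns are collinear here; fall back to one predictor.
            r2 = max(r_ys ** 2, r_yc ** 2)
        else:
            # Multiple R^2 for a regression on two predictors.
            r2 = (r_ys ** 2 + r_yc ** 2 - 2 * r_ys * r_yc * r_sc) / denom
        best = max(best, min(1.0, r2))
    return best


CANDIDATES = {
    "linear": linear_score,
    "exponential": exponential_score,
    "periodic": periodic_score,
}


def best_shape(xs, ys):
    """Return (shape_name, score) for the best-fitting candidate shape."""
    return max(((name, f(xs, ys)) for name, f in CANDIDATES.items()),
               key=lambda item: item[1])


def strongest_relationships(columns, top=3):
    """Score every pair of columns and return the `top` strongest.

    Each result is a tuple (score, x_name, y_name, shape_name).
    """
    results = []
    for (name_x, xs), (name_y, ys) in combinations(columns.items(), 2):
        shape, score = best_shape(xs, ys)
        results.append((score, name_x, name_y, shape))
    results.sort(reverse=True)
    return results[:top]
```

Calling `strongest_relationships` with a dictionary that maps column names to equal-length lists of numbers returns the highest-scoring pairs along with the shape that fit each one best. Unlike this sketch, which checks only three fixed shapes, MINE is described as looking for any type of clear structure in the data.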
The program is available for download on MINE's website.