Stephen E. Arnold: Averaging Data is Stupid

IO Impotency
Stephen E. Arnold

Averaging Information Is Not Cutting It Anymore

“Rescuing Collective Wisdom When The Average Group Opinion Is Wrong” is an article that pokes fun at the fanaticism running rampant in the news.  Beyond the fanaticism in the news, there is a real concern with averaging when it comes to data science and other fields that heavily rely on data.

The article breaks down the different ways averaging is used and the different theorems that are developed from it.  The introduction is a bit wordy but it sets the tone:

The total knowledge contained within a collective supersedes the knowledge of even its most intelligent member. Yet the collective knowledge will remain inaccessible to us unless we are able to find efficient knowledge aggregation methods that produce reliable decisions based on the behavior or opinions of the collective’s members. It is often stated that simple averaging of a pool of opinions is a good and in many cases the optimal way to extract knowledge from a crowd. The method of averaging has been applied to analysis of decision-making in very different fields, such as forecasting, collective animal behavior, individual psychology, and machine learning. Two mathematical theorems, Condorcet’s theorem and Jensen’s inequality, provide a general theoretical justification for the averaging procedure. Yet the necessary conditions which guarantee the applicability of these theorems are often not met in practice. Under such circumstances, averaging can lead to suboptimal and sometimes very poor performance.

Read full post.