When A Crowd Isn’t a Crowd
Diversity is essential to animating the collective intelligence that emerges in models like the MATLAB contest, but the existence of diversity isn’t enough. It must also be maintained. Get enough people together—be it in a bar or a chat room—and a mysterious dynamic kicks in. People either accentuate their differences and polarize into opposing camps, or they downplay their differences altogether in order to reach a consensus. Both phenomena have the same net effect: the diversity within the crowd is diminished. Humans have evolved over many millennia into highly social creatures. In many circumstances, our ability to reach an amicable agreement meant the difference between life and death: “A mammoth is charging. Shall we run or poke him with our spears?” But when collective intelligence is in play, as it is in such crowdsourcing models as information markets and problem-solving networks, consensus is an undesirable outcome.
In 2004 James Surowiecki published The Wisdom of Crowds: Why the Many Are Smarter than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations. The title of Surowiecki’s book is a winking reference to Charles Mackay’s 1841 classic, Extraordinary Popular Delusions & the Madness of Crowds, a stern indictment of the herd mentality that led to such disasters as the Dutch tulip mania. (Needless to say, the term crowdsourcing owes a debt to both authors.) While theories of group intelligence pre-dated Surowiecki’s book by decades, and in fact had recently come back into vogue in fields as disparate as sociology and business management, The Wisdom of Crowds captured the popular imagination in a way no other work on the subject ever had. The book contained an array of persuasive examples in which the crowd proved itself wiser than its smartest member. How did a crowd of fairgoers in rural England guess the weight of a prize-winning steer within one pound? How did a classroom of students guess the number of jelly beans in the jelly bean jar? How did the audience for the game show Who Wants to Be a Millionaire consistently beat the experts? Through the wisdom of crowds. Such anecdotes have acquired an almost magical patina, entering the collective imagination and becoming fodder for cocktail conversation and water cooler discussion. Unfortunately, in the retelling they have been shorn of Surowiecki’s careful analysis.
In fact there’s no magic to the wisdom of crowds, and the expression itself is a bit misleading. In these examples the crowd was neither wise nor even functioning as a crowd, per se. A crowd implies a group of people acting as a unit, as in “the crowd broke through the barrier and descended upon the author in a fit of hysteria.” Okay, authors don’t generally inspire that degree of unchecked adoration, but you get the idea. The definition of “crowd” is “a group of people united by a common characteristic.” By contrast, collective intelligence is diminished by too many common characteristics. It flourishes in direct proportion to the amount of diversity contained within a group of people, and their ability to express their individual viewpoints. In order to be wise, so to speak, the crowd can’t act like a crowd at all.
Other conditions must also be met for diversity to trump ability. First, it must be a real pickle of a problem. No one needs a diverse group of individuals to help them tie their shoes. Next, the crowd must have some qualifications to solve the problem at hand. A random collection of subway commuters could hardly be expected to outperform a group of nuclear engineers at designing a more efficient reactor; even Page’s brown socks were pulled from a faculty lounge, not the phone book. There must also be some method of aggregating and processing each individual’s contribution, such as the MATLAB contest’s scoring and ranking engine. Finally, participants must be drawn from a large enough pool to guarantee a diverse array of approaches, and their ability to express their individuality, their “local knowledge,” must not be impaired.
Keeping all this in mind, let’s revisit some of those examples that seemed so counter-intuitive at first glance. Take the case of the jellybean jar. Any more or less random collection of students will differ in background and experience, and thus possess different types of “private information” or local knowledge. The aggregation mechanism, in this case, is simply the teacher’s ability to collect all the estimates and calculate an average. But crucially, the students are asked to write down their guesses without conferring with their neighbors, so they are able to think and act independently. (Gulley’s MATLAB contestants don’t confer so much as they steal from one another. Their relative isolation allows them to retain their diversity.)
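The teacher's arithmetic can be sketched in a few lines of Python. This is only an illustration: the true count and the guesses are invented, and they were chosen so that some students overshoot and others undershoot, which is what independent guessing is assumed to produce here.

```python
from statistics import mean

# Hypothetical numbers, invented for illustration only.
true_count = 850
guesses = [400, 620, 700, 790, 900, 980, 1100, 1350]

# The aggregation mechanism: collect every estimate and average them.
crowd_estimate = mean(guesses)
crowd_error = abs(crowd_estimate - true_count)

# How far off each student was individually.
individual_errors = [abs(g - true_count) for g in guesses]
typical_error = mean(individual_errors)

# Count the students whose own guess was worse than the class average.
beaten = sum(error > crowd_error for error in individual_errors)

print(f"class average: {crowd_estimate}, off by {crowd_error}")
print(f"typical student is off by {typical_error}")
print(f"the average beats {beaten} of {len(guesses)} students")
```

With these made-up guesses the overestimates and underestimates largely cancel, so the class average lands closer to the truth than any single student. If every student erred in the same direction, as they might after conferring, the average would inherit that shared error.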
Now let’s look again at a game show audience’s ability to accurately predict over 90 percent of the answers. On the game show in question, Who Wants to Be a Millionaire, contestants are asked a series of fifteen questions of increasing difficulty. If they answer all fifteen correctly they win $1 million. The questions are in multiple-choice format, with four possible answers. When they get stuck, contestants ask for a “lifeline.” This means either calling a friend (presumably chosen for their encyclopedic knowledge) for help, or polling the audience. The “experts” perform admirably, getting the correct answer 65 percent of the time. But the audience does far better, guessing the correct answer 91 percent of the time.
This seems deeply impressive. It’s far better than all but the very best contestants do, and it would seem to provide ample evidence that the group is smarter than its smartest individual. But it’s actually just a function of simple arithmetic, an illustration that if even a tiny number of individuals possess the correct answer, the group itself will predict accurately. This, Page writes, is because “the mistakes cancel one another out, and correct answers, like cream, rise to the surface.” This can be easily illustrated. Say the question posed to the audience, to use a real example from the show, is whether “Sherpas” and “Gurkhas” are native to (A) Nepal; (B) Morocco; (C) Ecuador; or (D) Russia. Suppose only four percent of the audience knows that the correct answer is (A) Nepal, and the remaining 96 percent guesses randomly among all four answers. Those guesses split evenly, so Morocco, Ecuador and Russia will each receive 24 percent of the votes, but Nepal will receive 24 percent plus the four percent who know, or 28 percent, cluing our contestant to the correct answer.
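The same calculation can be written out in Python. This is a sketch of the expected vote shares only, and it assumes that everyone who doesn't know the answer guesses uniformly at random. The function name `expected_poll` is made up for this illustration.

```python
def expected_poll(percent_who_know, options, correct):
    """Return the expected vote share, in percent, for each option."""
    # Assumption: everyone who doesn't know splits evenly across all
    # options, including the correct one.
    guess_share = (100 - percent_who_know) / len(options)
    return {
        option: guess_share + (percent_who_know if option == correct else 0)
        for option in options
    }


options = ["Nepal", "Morocco", "Ecuador", "Russia"]
shares = expected_poll(4, options, correct="Nepal")

# The option with the largest share is the audience's answer.
winner = max(shares, key=shares.get)

print(shares)
print(winner)
```

These are expected shares. In a real studio audience the random guesses won't divide perfectly evenly, so a four-point margin is more reliable in a large audience than in a small one.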
And of course, there’s a big difference between guessing trivia questions and improving an algorithm by a factor of 1,000. The latter isn’t merely impressive; it defies belief. But in fact the same circumstances (diversity and the proper conditions in which to express it) are at work in both examples. On its face, the MATLAB contest would seem to attract mostly Mensa-quality programmers. In other words, it’s a group that has self-selected according to its proficiency at solving such problems. A studio audience, on the other hand, is a randomly selected group.
It is certainly true that many of the most talented MATLAB programmers participate in the contest. But the best coders have generally all learned the same tricks and shortcuts from years of using the MATLAB computer language. It’s the inexperienced coders, the outsiders who have to come up with their own shortcuts, who make possible the giant cognitive leaps that allow the winning solution to improve on the initial solution by so many orders of magnitude. If great minds think alike (and in many circumstances they do) then they really constitute only one mind. Or as Page puts it, “two heads aren’t better than one when it’s really only one head.” A diverse group of solvers results in many different approaches to a problem. How such groups apply that diversity to real-world problems, far more involved than guiding a salesman through a set of cities, is the subject of our next chapter.


Good post, Jeff. But I'm not sure I agree on the difference in importance between the MatLab-case and the AskTheAudience-case. While it would seem that the former can create differences of a much greater magnitude, the latter clearly delivers a usable level of difference/accuracy at much greater speeds. What I miss here is a discussion about how this could be harnessed and utilized. Some of the existing crowdsourcing-plays clearly do this; I'm thinking in particular about cases where human agents have been used to verify images or aid in scanning results. We're often overly impressed by huge improvements, and miss out on the fact that incremental innovation is the real lifesaver for companies.
Hmm, just thought of something. Couldn't Toyota's impressive innovation policy be read as a kind of internal crowdsourcing? See http://www.newyorker.com/talk/financial/2008/05/12/080512ta_talk_surowiecki
Posted by: | May 14, 2008 at 12:15 AM
As a long-time MATLAB programmer and fan, let me first say I appreciate your using that example. I think there *is* a big big difference. I could write down a mathematical model of the jellybean or millionaire example in 5 minutes (they are mostly Monte Carlo type randomized algorithms with some assumptions that will drive reasonable convergence properties), but the Matlab one is a complex social phenomenon. I haven't participated in the contests, but basically the same thing happened to me on a project team, where I spent a couple of days coding a solution to a problem, which seemed perfect, but didn't work because of an obscure bug. When I finally gave up and quit, my friend and project teammate spent just an hour or so on my code and found the bug I'd missed. It was a single misplaced apostrophe :) The code ran beautifully after that.
Modeling the 'many eyeballs/shallow bugs' effect is hard. There may be things hidden there that get to the 'smarter in practice than in theory' phenomenon. But in general, the explanation usually given for 'unreasonable effectiveness' effects (such as in neural networks) is a combination of fundamental properties (NNs are universal function/pattern recognizers) along with the 'future is like the past' statistical assumption (or similar ergodicity conditions). This works for crowds too.
And btw, you're going to have to work harder to get comments :) My current ratio on my blog is hovering around 2.5 per post for about 80 posts, and I know I put in a LOT of thought and effort to drive up the commenting culture of my readers.
Posted by: Venkat | May 14, 2008 at 03:41 AM
We've been talking a lot recently with Tommi Vilkamo, the head of Nokia Beta Labs, about the situations in which prediction markets actually work. I am sure there will be a large number of executives wanting to try this on something it might not suit. I'd therefore like to point out some of our thoughts.
There are "problems" where the solution information is too unevenly spread within the population for prediction to work. In such situations, an "open market" approach produces wrong conclusions, since it will be biased toward the opinion that is available to the majority but does not represent the whole picture. If a piece of critical information is only available to a handful, the prediction will be biased, but narrowing the "crowd" too much, on the other hand, would lose its crowd characteristic.
For example, imagine a company wanting to predict which of four possible new product concepts will sell most. In such a situation, if consumers are left to predict, the result will only reflect mass opinion (e.g. "what would be coolest"). What if a certain product has a major flaw that is only known to the development team? Even if it was a big company and the prediction was done in-house, the key information might be too unevenly spread.
Another thing that I would like to point out about crowd problem-solving is that when the quality of a solution can be instantly measured, as in the MATLAB example, having a crowd work on it is extremely powerful. If instead the delay in feedback on a given possible improvement is significant relative to the time span of the whole problem, the crowd loses much of its efficiency. In such cases it would be difficult to identify the current best candidate, which would then spread the resources across different solutions. Also, trial-and-error would not function, so that would leave out many of the script kiddies' contributions ;)
Brilliant blog, will drop by frequently!
Posted by: Ilkka Peltola | May 20, 2008 at 04:48 AM