When A Crowd Isn’t a Crowd
Diversity is essential to animating the collective intelligence that emerges in models like the MATLAB contest, but the existence of diversity isn’t enough. It must also be maintained. Get enough people together—be it in a bar or a chat room—and a mysterious dynamic kicks in. People either accentuate their differences and polarize into opposing camps, or they downplay their differences altogether in order to reach a consensus. Both phenomena have the same net effect: the diversity within the crowd is diminished. Humans have evolved over many millennia into highly social creatures. In many circumstances, our ability to reach an amicable agreement meant the difference between life and death: “A mammoth is charging. Shall we run or poke him with our spears?” But when collective intelligence is in play, as it is in such crowdsourcing models as information markets and problem-solving networks, consensus is an undesirable outcome.
In 2004 James Surowiecki published The Wisdom of Crowds: Why the Many Are Smarter than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations. The title of Surowiecki’s book is a winking reference to Charles Mackay’s 1841 classic, Extraordinary Popular Delusions & the Madness of Crowds, a stern indictment of the herd mentality that led to such disasters as the Dutch tulip mania. (Needless to say, the term crowdsourcing owes a debt to both authors.) While theories of group intelligence predated Surowiecki’s book by decades, and in fact had recently come back into vogue in fields as disparate as sociology and business management, The Wisdom of Crowds captured the popular imagination in a way no other work on the subject ever had. The book contained an array of persuasive examples in which the crowd proved itself wiser than its smartest member. How did a crowd of fairgoers in rural England guess the weight of a prize-winning steer to within one pound? How did a classroom of students guess the number of jellybeans in the jellybean jar? How did the audience for the game show Who Wants to Be a Millionaire consistently beat the experts? Through the wisdom of crowds. Such anecdotes have acquired an almost magical patina, entering the collective imagination and becoming fodder for cocktail conversation and water cooler discussion. Unfortunately, they were shorn of Surowiecki’s careful analysis.
In fact there’s no magic to the wisdom of crowds, and the expression itself is a bit misleading. In these examples the crowd was neither wise nor even functioning as a crowd, per se. A crowd implies a group of people acting as a unit, as in “the crowd broke through the barrier and descended upon the author in a fit of hysteria.” Okay, authors don’t generally inspire that degree of unchecked adoration, but you get the idea. The definition of “crowd” is “a group of people united by a common characteristic.” By contrast, collective intelligence is diminished by too many common characteristics. It flourishes in direct proportion to the amount of diversity contained within a group of people, and their ability to express their individual viewpoints. In order to be wise, so to speak, the crowd can’t act like a crowd at all.
There are other conditions that must be met for diversity to trump ability. First, it must be a real pickle of a problem: no one needs a diverse group of individuals to help them tie their shoes. Next, the crowd must have some qualifications to solve the problem at hand. A random collection of subway commuters could hardly be expected to outperform a group of nuclear engineers at designing a more efficient reactor—even Page’s brown socks were pulled from a faculty lounge, not the phone book. There must also be some method of aggregating and processing each individual’s contribution, such as the MATLAB contest’s scoring and ranking engine. Finally, participants must be drawn from a large enough pool to guarantee a diverse array of approaches, and their ability to express their individuality—their “local knowledge”—must not be impaired.
Keeping all this in mind, let’s revisit some of those examples that seemed so counter-intuitive at first glance. Take the case of the jellybean jar. Any more or less random collection of students will differ in background and experience, and thus possess different types of “private information” or local knowledge. The aggregation mechanism, in this case, is simply the teacher’s ability to collect all the estimates and calculate an average. But crucially, the students are asked to write down their guesses without conferring with their neighbors, so they are able to think and act independently. (Gulley’s MATLAB contestants don’t confer so much as they steal from one another. Their relative isolation allows them to retain their diversity.)
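The averaging mechanism is worth seeing in miniature. The sketch below (Python, with made-up numbers—the jar’s true count, the class size, and the noise range are all assumptions) shows why the mean of many independent, roughly unbiased guesses lands closer to the truth than a typical individual guess does:

```python
import random

random.seed(7)  # make the run reproducible

TRUE_COUNT = 850   # assumed number of jellybeans in the jar
N_STUDENTS = 200   # assumed class size

# Each student guesses independently: right on average, but noisy.
guesses = [TRUE_COUNT * random.uniform(0.5, 1.5) for _ in range(N_STUDENTS)]

crowd_estimate = sum(guesses) / len(guesses)  # the teacher's simple average
typical_error = sum(abs(g - TRUE_COUNT) for g in guesses) / len(guesses)
crowd_error = abs(crowd_estimate - TRUE_COUNT)

print(f"typical individual error: {typical_error:.0f} beans")
print(f"error of the averaged guess: {crowd_error:.0f} beans")
```

Because the individual errors are independent and roughly symmetric, they largely cancel in the average. If the students conferred before guessing, their errors would become correlated and the cancellation would fail—which is exactly why independence matters.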
Now let’s look again at a game show audience’s ability to accurately predict over 90 percent of the answers. On the game show in question, Who Wants to Be a Millionaire, contestants are asked a series of fifteen questions of increasing difficulty. If they answer all fifteen correctly they win $1 million. The questions are multiple-choice format, with four possible answers. When they get stuck, contestants ask for a “lifeline.” This means either calling a friend—presumably chosen for their encyclopedic knowledge—for help, or polling the audience. The “experts” perform admirably, getting the correct answer 65 percent of the time. But the audience does far better, guessing the correct answer 91 percent of the time.
This seems deeply impressive. It’s far better than all but the very best contestants do, and it would seem to provide ample evidence that the group is smarter than its smartest individual. But it’s actually just a function of simple arithmetic, an illustration that if even a tiny number of individuals possess the correct answer, the group as a whole will identify it. This, Page writes, is because “the mistakes cancel one another out, and correct answers, like cream, rise to the surface.” This can be easily illustrated. Say the question posed to the audience, to use a real example from the show, is whether “Sherpas” and “Gurkhas” are native to (A) Nepal; (B) Morocco; (C) Ecuador; or (D) Russia. If only four percent of the audience knows that the correct answer is (A) Nepal, the rest of the audience can be expected to guess randomly among all four answers. The result is that Morocco, Ecuador, and Russia will each receive 24 percent of the guesses, but Nepal will receive 28 percent, cluing our contestant in to the correct answer.
And of course, there’s a big difference between guessing trivia questions and improving an algorithm by orders of magnitude. The latter isn’t merely impressive—it defies belief. But in fact the same circumstances—diversity and the proper conditions in which to express it—are at work in both examples. On its face, the MATLAB contest would seem to attract mostly Mensa-quality programmers. In other words, it’s a group that has self-selected according to its proficiency at solving such problems. A studio audience, on the other hand, is a randomly selected group.
It is certainly true that many of the most talented MATLAB programmers participate in the contest. But the best coders have generally all learned the same tricks and shortcuts from years of using the MATLAB programming language. It’s the inexperienced coders—the outsiders who have to come up with their own shortcuts—who make possible the giant cognitive leaps that allow the winning solution to improve on the initial solution by so wide a margin. If great minds think alike—and in many circumstances they do—then they really constitute only one mind. Or as Page puts it, “two heads aren’t better than one when it’s really only one head.” A diverse group of solvers results in many different approaches to a problem. How such groups apply this diversity to real-world problems—far more involved than guiding a salesman through a set of cities—is the subject of our next chapter.