Yet another 'Wisdom of Crowds' success
I was at the Federal Building in downtown San Diego for a consulting job, and met some representatives of a life and disability insurance company who were giving away a big-screen HD TV to the individual who came closest to guessing the number of M&Ms (chocolate and peanut butter filled) in a container. Because they do this often, I won't show the specific container they use.
I offered to make a guess of the total, but only if I could see all of the guesses so far. I was drawing on the Wisdom of Crowds example from Chapter 1 of the book, where the average of a set of independent guesses tends to outperform even an expert's best guess. I've done the same experiment many times in data mining courses I've taught, and have found the same phenomenon.
I collected data from 77 individuals (including myself) shown here (sorted for convenience, but this makes no difference in the analysis):
Note there are a few flaky ones in the lot. The lowest and highest guesses (37 and 187932) were easy to spot, so I put them at the bottom of my list. The idea, of course, is to just take the average of the guesses.
Average all: 4932
Average all without 37 and 187932: 2626
Then I looked at the histogram and decided that the guesses close to 10000 were also too flaky to include:
So I removed all data points greater than 8000, which took away 2 samples, leaving this histogram and a mean of 2436.
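The trimming steps above can be sketched in a few lines of Python. This is only an illustration: the `sample` list is made up (it is not the contest data), the helper name `trimmed_mean` is hypothetical, and the lower cutoff of 100 is an assumption standing in for removing the obviously flaky low guess by eye. Only the upper cutoff of 8000 comes from the post.

```python
from statistics import mean

def trimmed_mean(guesses, low, high):
    """Average only the guesses that fall inside [low, high].

    Returns the mean and the number of guesses that were kept.
    """
    kept = [g for g in guesses if low <= g <= high]
    if not kept:
        raise ValueError("no guesses left after trimming")
    return mean(kept), len(kept)

# Hypothetical guesses for illustration only, NOT the real contest data.
sample = [37, 1800, 2200, 2500, 3100, 9800, 187932]

# low=100 is an assumed cutoff; high=8000 is the threshold used in the post.
avg, n = trimmed_mean(sample, low=100, high=8000)
print(f"kept {n} of {len(sample)} guesses, trimmed mean = {avg}")
```

Choosing the cutoffs after looking at the histogram, as I did here, is a judgment call, so it is worth reporting the untrimmed average alongside the trimmed one.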
So now for the outcome:
Actual Count: 2464
Average of trimmed sample: 2436 (error 28)
Best individual guess: 2500 (error 36)
So amazingly, the average won, though I wouldn't have been disappointed at all if it had finished 3rd or 4th, because it still would have been a great guess.
Wisdom of Crowds wins again!
PS I reported a guess of 2423 to the insurance agents, because at that point I had omitted my own original guess (2550, made before looking at any other guesses, if you must know) and my co-worker's guess of 3250. Adding those two back in brings the mean up a bit, to 2436. The average would have lost (barely) if I had not included them.
PPS So how will they split the winnings, since two people guessed the same value? I won't recommend the saw approach. I hope they ask each of the two guessers to revise their guess, and require that each revision differ from the original by at least one.
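The figures in the PS can be checked with a little arithmetic. The sketch below assumes the trimmed sample has 73 guesses (77 minus the two flaky ones and the two above 8000) and rebuilds the total from the rounded mean of 2436, so the result is approximate; the variable names are just for illustration.

```python
n_trimmed = 77 - 2 - 2          # 73 guesses left after both trimming steps
mean_trimmed = 2436             # rounded mean reported above
actual_count = 2464
best_individual_error = 36      # |2500 - 2464|

# Approximate total of the trimmed sample, rebuilt from the rounded mean.
total = mean_trimmed * n_trimmed

# Drop my guess (2550) and my co-worker's guess (3250) and re-average.
mean_without = (total - 2550 - 3250) / (n_trimmed - 2)

print(round(mean_without))                      # about 2423
print(abs(actual_count - mean_without))         # about 41, worse than 36
print(abs(actual_count - mean_trimmed))         # 28, better than 36
```

So without those two guesses the average misses by about 41 and loses to the best individual guess (error 36); with them it misses by 28 and wins.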