Labeling Images with a Computer Game
Luis von Ahn and Laura Dabbish | CHI 2004
In a Nutshell
Von Ahn and Dabbish [1] introduce the ESP Game, an interactive two-player game that harvests human intelligence to assign labels to images. Both players are shown the same image, and each player's objective is to guess what their partner types for it (see Figure 1).
Once the partners "agree on an image", they are awarded points and move on to the next image. Through this mechanism, the partners effectively assign a label to the image, since the guess they share is something both of them independently associated with it.
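The agreement step can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function names, the lowercase normalization, and the assumption that a match counts as soon as one player's guess appears anywhere in the other player's earlier guesses are all choices made for this sketch.

```python
from typing import Iterable, Optional, Set, Tuple


def normalize(guess: str) -> str:
    """Lowercase and trim a guess so trivial differences do not block a match."""
    return guess.strip().lower()


def first_agreement(guesses: Iterable[Tuple[str, str]]) -> Optional[str]:
    """Return the first word that both players have typed, or None.

    `guesses` is a time-ordered stream of (player, guess) pairs, where
    player is "A" or "B". Assumption for this sketch: the players do not
    have to type the word at the same moment; a match occurs as soon as
    a new guess is already in the other player's history.
    """
    seen: dict[str, Set[str]] = {"A": set(), "B": set()}
    for player, guess in guesses:
        word = normalize(guess)
        other = "B" if player == "A" else "A"
        if word in seen[other]:
            return word  # this word becomes the image's label
        seen[player].add(word)
    return None


# Hypothetical round for the image in Figure 1.
stream = [("A", "bag"), ("B", "purse"), ("A", "Purse")]
print(first_agreement(stream))  # prints "purse"
```

Neither player sees the other's guesses, so the only way to score is to type something the image itself suggests, which is what makes the matched word usable as a label.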
The ESP Game has several additional features:
Taboo words. Taboo words are words that players cannot use as guesses. They are obtained from the game itself and consist of words that players commonly guess for the image. For example, the taboo words for Figure 1 may be "bag", "brown", and "purse". Ruling out these more general terms encourages more diverse guesses and hence more diverse labels for the image.
Pre-recorded gameplay. If there are not enough players at a given time, a single player can be paired with pre-recorded gameplay, that is, a sequence of actions recorded from an earlier game session. This allows the game to be played asynchronously.
Theme rooms. Some images may require more context-specific labels. For example, given a piece of artwork, the average player may guess "painting" instead of the painting's title, artist, or genre. An "art" theme room can allow interested players to play with only images associated with art, encouraging more context-specific labels.
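To make the taboo-word feature concrete, here is a minimal Python sketch of how agreed labels could feed back into the game as taboo words. The class name, its methods, and the `promote_after` threshold are hypothetical; this summary does not say how many agreements are needed before a word becomes taboo.

```python
from collections import Counter


class ImageLabels:
    """Tracks agreed labels for one image and derives its taboo words."""

    def __init__(self, promote_after: int = 1) -> None:
        # Hypothetical threshold: number of agreements on a word
        # before it becomes taboo for later players.
        self.promote_after = promote_after
        self.agreements: Counter = Counter()

    @property
    def taboo(self) -> set:
        """Words that have been agreed on often enough to be ruled out."""
        return {w for w, n in self.agreements.items() if n >= self.promote_after}

    def is_allowed(self, guess: str) -> bool:
        """A guess is allowed only if it is not currently taboo."""
        return guess.strip().lower() not in self.taboo

    def record_agreement(self, word: str) -> None:
        """Store a label that a pair of players agreed on."""
        self.agreements[word.strip().lower()] += 1


labels = ImageLabels(promote_after=2)
labels.record_agreement("bag")
labels.record_agreement("Bag")    # second agreement: "bag" is now taboo
labels.record_agreement("purse")  # only one agreement so far
print(labels.is_allowed("bag"))    # prints False
print(labels.is_allowed("purse"))  # prints True
```

Because the taboo list grows from the players' own agreements, each new pair is pushed toward words that have not yet been recorded for the image.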
Overall, the paper addresses the image-labeling problem by taking advantage of people's desire to be entertained. The authors estimate that 5,000 people playing the game for 24 hours a day would allow all images indexed by Google at the time to be labeled within a few weeks.
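The shape of that estimate is simple arithmetic, sketched below in Python. Only the 5,000 players come from the text above; the image count and the labeling rate per pair are illustrative placeholders chosen for this sketch, not figures taken from this summary.

```python
def days_to_label(
    num_images: int,
    players: int,
    labels_per_pair_per_minute: float,
    labels_per_image: int = 1,
) -> float:
    """Estimate the days needed to label every image with continuous play."""
    pairs = players // 2  # the game is played in pairs
    labels_per_day = pairs * labels_per_pair_per_minute * 60 * 24
    return num_images * labels_per_image / labels_per_day


# Illustrative placeholder inputs (assumptions, apart from the 5,000 players).
days = days_to_label(
    num_images=400_000_000,
    players=5_000,
    labels_per_pair_per_minute=4.0,
)
print(f"{days:.1f} days")  # prints "27.8 days"
```

Under these assumed inputs the result is about four weeks for one label per image. Doubling `labels_per_image` doubles the time, so the estimate depends heavily on how many labels each image is meant to receive.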
Some Thoughts
The ESP Game is an interesting approach to crowdsourcing. Rather than relying on monetary incentives, the authors use gamification to recruit human workers at virtually zero cost.
Von Ahn went on to apply this "game with a purpose" concept in creating the reCAPTCHA authentication system [2], which Google uses to train its image recognition models, and the Duolingo language-learning app [3], which uses learners' inputs to understand the nature of language and learning.
Nonetheless, I wonder about the ethical implications of exploiting the addictive nature of games to incentivize people to essentially work for free, particularly in the case of minors, who are especially vulnerable [4].
[1] Von Ahn, L., & Dabbish, L. (2004, April). Labeling images with a computer game. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 319-326).
[2] reCAPTCHA. (2021). Retrieved 28 February 2021, from https://www.google.com/recaptcha/about
[3] Learn a language for free. (2021). Retrieved 28 February 2021, from https://www.duolingo.com
[4] Chiu, S. I., Lee, J. Z., & Huang, D. H. (2004). Video game addiction in children and teenagers in Taiwan. CyberPsychology & Behavior, 7(5), 571-581.