Dante - Di Michelino 150° sponsors

Corporate & Society Sponsors
Loquendo diamond package
Nuance gold package
AT&T bronze package
Google silver package
Appen bronze package
Interactive Media bronze package
Microsoft bronze package
SpeechOcean bronze package
Avios logo package
NDI logo package


Université d'Avignon
SpeechCycle
Università di Firenze
Univ. Trento
Univ. Napoli
Univ. Tuscia
Univ. Calabria
Univ. Venezia


Comune di Firenze
Firenze Fiera
Florence Convention Bureau


12th Annual Conference of the
International Speech Communication Association


Interspeech 2011 Florence

Special Sessions


Crowdsourcing for speech processing

Sun-Ses3-S1-O - oral
Sun-Ses3-S1-P - poster

As the amount of speech data used in research and commercial applications has risen, so has the problem of how to obtain this data, how to transcribe it, and how to assess the quality of systems that use speech. Until recently the solutions have been expensive and slow. Experts have, for example, either annotated data themselves or trained groups of people to do the task. Paying them is costly, and the combination of the training process and the subsequent throughput has added considerable delay to system development. In response, several researchers have recently turned to crowdsourcing, where non-experts perform tasks in exchange for an incentive. The literature in this area (http://sites.google.com/site/amtworkshop2010/) shows that, when used properly, this approach can produce results comparable to those obtained from experts, in less time and at lower cost. Typically, crowdsourcing involves asking several workers to perform very small tasks (like labeling one sentence) and paying them small amounts (like $0.05 per task). With access to a multitude of workers, the tasks are accomplished very quickly. Typical sources of workers, although certainly not the only ones, are Amazon Mechanical Turk and CrowdFlower. This session will examine the breadth of work in this area and include papers about the use of crowdsourcing for speech processing. The following is a list of possible areas, although papers in other related areas are welcome:
  • data acquisition,
  • speech labeling,
  • assessment and evaluation,
  • user studies involving speech.

While these sources of workers are not yet available in some countries, several efforts have been started to create local groups of workers, and we welcome researchers who can describe their efforts to set up such a source.

We expect that this session will not only interest those who have used crowdsourcing and want to show their findings, but also those who are curious about how crowdsourcing can be useful for their speech processing needs.


Maxine Eskenazi, Carnegie Mellon University (max@cmu.edu) has used crowdsourcing for the transcription of large amounts of speech data. She is Principal Systems Scientist in the Language Technologies Institute at Carnegie Mellon and is Director of the Dialog Research Center.

Helen Meng, The Chinese University of Hong Kong (hmmeng@se.cuhk.edu.hk) has used crowdsourcing in spoken dialog systems evaluation. She is a Professor at The Chinese University of Hong Kong and Director of the CUHK-MoE-Microsoft Key Laboratory on Human-centric Computing and Interface Technologies.

David Suendermann, SpeechCycle, Inc. (david@speechcycle.com) has performed research into transcription and semantic annotation of tens of millions of utterances in commercial spoken dialog systems using crowdsourcing techniques. David is the Principal Speech Scientist of SpeechCycle.

Gina Levow, University of Washington (levow@u.washington.edu) has worked in the area of spoken language processing for more than 15 years. She received her Ph.D. in Computer Science from M.I.T. and has since pursued research at institutions including the University of Maryland (College Park), the University of Chicago, and the University of Manchester. She has published more than 50 peer-reviewed papers and is currently an Assistant Professor at the University of Washington.