Connected Digits Recognition


In the Connected Digits Recognition task, systems are required to recognize sequences of spoken Italian digits (numbers ranging from 0 to 9).

Two subtasks are defined, and participants may enter either one or both:

  • Clean digits: in this subtask, the test digit sequences are recorded in a clean environment;
  • Noisy digits: in this subtask, the test digit sequences are recorded in a noisy environment. The type of noise may vary from white noise to traffic noise, room noise, etc.

The evaluation is based on the minimum edit distance between the transcription produced by the recognizer and the orthographic annotation. Accuracy will be calculated at both the word and the phrase level.
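
As an illustration of this kind of scoring, the following Python sketch computes a word-level edit distance and derives word and phrase accuracy from it. It is not the official scoring tool: the function names, the whitespace tokenization, and the exact accuracy formulas (word accuracy as 1 minus total edit errors over total reference words, phrase accuracy as the fraction of sequences recognized without any error) are assumptions based on common practice in speech recognition evaluation.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Minimum number of substitutions, deletions and insertions
    needed to turn the reference sequence into the hypothesis."""
    prev = list(range(len(hyp) + 1))  # distances for an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # hypothesis empty: i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def score(pairs: List[Tuple[str, str]]) -> Tuple[float, float]:
    """Return (word accuracy, phrase accuracy) for a list of
    (reference, hypothesis) transcriptions. Formulas are assumed."""
    total_words = 0
    total_errors = 0
    correct_phrases = 0
    for ref_text, hyp_text in pairs:
        ref, hyp = ref_text.split(), hyp_text.split()
        errors = edit_distance(ref, hyp)
        total_words += len(ref)
        total_errors += errors
        if errors == 0:
            correct_phrases += 1
    word_accuracy = 1.0 - total_errors / total_words
    phrase_accuracy = correct_phrases / len(pairs)
    return word_accuracy, phrase_accuracy


# Hypothetical example data, not taken from the task corpus.
example_pairs = [
    ("uno due tre", "uno due tre"),                      # exact match
    ("zero nove", "zero otto nove"),                     # one insertion
    ("sei sette quattro cinque", "sei quattro cinque"),  # one deletion
]
word_acc, phrase_acc = score(example_pairs)
```

With these three hypothetical sequences there are 2 errors over 9 reference digits, so the sketch yields a word accuracy of 7/9 and a phrase accuracy of 1/3. Note that with this definition word accuracy can become negative when a system produces many insertions.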

Training and development material extracted from wideband (16 kHz) corpora will be provided.


Task materials


Data Distribution

  • The test data are now available. System results must be sent by e-mail to gianpaolo.coro[at]abla.it, gretter[at]fbk.eu, and matasso[at]fbk.eu.
  • The training data consist of 5348 sentences (17505 digits); the development data consist of 515 sentences (3569 digits).
    The data are freely available; no fee is required.
    The data can be downloaded from this address: http://evalita.fbk.eu/evalita2009srt.zip

Organizers
Gianpaolo Coro (ABLA, Milano), Roberto Gretter (FBK-irst, Trento) and Marco Matassoni (FBK-irst, Trento)