
CRIM’s Speech Transcription and Call Sign Detection System for the ATC Airbus Challenge task

  • Computer Research Institute of Montreal

Research output: Contribution to journal, Conference article, peer-review

5 Citations (Scopus)

Abstract

The Airbus air traffic control challenge evaluates speech recognition and call sign detection using real conversations between air traffic controllers and pilots at Toulouse airport in France. CRIM’s main contribution in acoustic modeling for transcribing these conversations is experimentation with bidirectional LSTM (BLSTM) models and lattice-free MMI (LF-MMI) trained TDNN models. Adapting these acoustic models, trained on a large dataset, to 40 hours of ATC acoustic training data reduces WER significantly compared with training them on the ATC data alone. Multiple iterations of adaptation reduce WER significantly for the BLSTM acoustic models, but only marginally for the LF-MMI TDNN acoustic models. The constrained dialog between the air traffic controller and the pilot leads to a language model perplexity below 12, and to WERs of 9.98% and 9.41% on the leaderboard and evaluation sets, respectively. For call sign detection from the decoded transcript, we use a bidirectional LSTM followed by a conditional random field classifier. This DNN architecture worked better than call sign detection based on a finite state transducer. Taking a majority vote over call signs from multiple decodes reduced the call sign errors. The best F1 for call sign detection was 0.8289 on the leaderboard set and 0.8017 on the evaluation set. Overall, we placed 3rd in this evaluation.
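The majority vote over call signs from multiple decodes can be illustrated with a minimal sketch. The abstract does not describe how candidates are normalized or how ties are broken, so the upper-casing, the skipping of empty candidates, and the first-seen tie-break below are assumptions, and the function name `majority_vote_call_sign` is hypothetical rather than taken from the system itself.

```python
from collections import Counter


def majority_vote_call_sign(candidates):
    """Return the most frequent call sign among several decodes.

    `candidates` holds one detected call sign per decode; None or blank
    entries mean that decode produced no call sign. Returns None when
    no decode produced a call sign.
    """
    # Assumed normalization: trim whitespace and upper-case each candidate.
    votes = Counter(
        c.strip().upper() for c in candidates if c and c.strip()
    )
    if not votes:
        return None
    # most_common() keeps first-seen order among equal counts, so a tie
    # goes to the candidate from the earliest decode (an assumed policy).
    return votes.most_common(1)[0][0]
```

For example, if three decodes yield two identical call signs and one that differs in a single digit, the vote returns the repeated one and discards the outlier.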

Original language: English
Pages (from-to): 3018-3022
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2019-September
Publication status: Published - 2019
Externally published: Yes
Event: 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019 - Graz, Austria
Duration: 15 Sept 2019 to 19 Sept 2019

Keywords

  • air traffic control automation
  • Bi-directional LSTM
  • call sign detection
  • Deep Neural Networks
  • DNN
  • TDNN
