Integration framework for speech processing with live visualization interfaces

  • David Brodeur
  • Francois Grondin
  • Yazid Attabi
  • Pierre Dumouchel
  • Francois Michaud

Research output: Contribution to Book/Report types; Contribution to conference proceedings; peer-review

2 Citations (Scopus)

Abstract

Audition is a rich source of spatial, identity, linguistic and paralinguistic information. Processing all this information requires acquisition, processing and interpretation of sound sources, which are instantaneous, invisible and noisy signals. This can lead to different responses by the system in relation to the information perceived. This paper presents our first implementation of an integration framework for speech processing. Acquisition includes sound capture, sound source localization, tracking, separation and enhancement, and voice activity detection. Processing involves speech and emotion recognition. Interpretation consists of translating speech utterances into commands that can influence interaction through dialogue management and speech synthesis. The paper also describes two visualization interfaces, inspired by comic strips, to represent live vocal interactions in real life environments. These interfaces are used to demonstrate how the framework performs in live interactions and its use in a usability study.

Original language: English
Title of host publication: 25th IEEE International Symposium on Robot and Human Interactive Communication, RO-MAN 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 144-150
Number of pages: 7
ISBN (Electronic): 9781509039296
Publication status: Published - 15 Nov 2016
Event: 25th IEEE International Symposium on Robot and Human Interactive Communication, RO-MAN 2016 - New York, United States
Duration: 26 Aug 2016 to 31 Aug 2016

Publication series

Name: 25th IEEE International Symposium on Robot and Human Interactive Communication, RO-MAN 2016

Conference

Conference: 25th IEEE International Symposium on Robot and Human Interactive Communication, RO-MAN 2016
Country/Territory: United States
City: New York
Period: 26/08/16 to 31/08/16
