Integration framework for speech processing with live visualization interfaces

  • David Brodeur
  • Francois Grondin
  • Yazid Attabi
  • Pierre Dumouchel
  • Francois Michaud

Research output: Chapter in book, report, or conference proceedings; Contribution to a collective work linked to a symposium or conference; Peer-reviewed

2 Citations (Scopus)

Abstract

Audition is a rich source of spatial, identity, linguistic and paralinguistic information. Processing all this information requires acquisition, processing and interpretation of sound sources, which are instantaneous, invisible and noisy signals. This can lead to different responses by the system in relation to the information perceived. This paper presents our first implementation of an integration framework for speech processing. Acquisition includes sound capture, sound source localization, tracking, separation and enhancement, and voice activity detection. Processing involves speech and emotion recognition. Interpretation consists of translating speech utterances into commands that can influence interaction through dialogue management and speech synthesis. The paper also describes two visualization interfaces, inspired by comic strips, to represent live vocal interactions in real life environments. These interfaces are used to demonstrate how the framework performs in live interactions and its use in a usability study.

Original language: English
Title: 25th IEEE International Symposium on Robot and Human Interactive Communication, RO-MAN 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 144-150
Number of pages: 7
ISBN (Electronic): 9781509039296
Status: Published - 15 Nov. 2016
Event: 25th IEEE International Symposium on Robot and Human Interactive Communication, RO-MAN 2016 - New York, United States
Duration: 26 Aug. 2016 to 31 Aug. 2016

Publication series

Name: 25th IEEE International Symposium on Robot and Human Interactive Communication, RO-MAN 2016

Conference

Conference: 25th IEEE International Symposium on Robot and Human Interactive Communication, RO-MAN 2016
Country/Territory: United States
City: New York
Period: 26/08/16 to 31/08/16
