Acoustic word embeddings and neural segmental models

20/12/2016 - 12:00

This talk covers two closely related lines of work.  

For a number of speech tasks, it can be useful to represent speech segments of arbitrary length by fixed-dimensional vectors, or embeddings. In particular, vectors representing word segments -- acoustic word embeddings -- can be used in query-by-example tasks, example-based speech recognition, or spoken term discovery. *Textual* word embeddings have been common in natural language processing for a number of years now; the acoustic analogue has only recently begun to be explored. This talk will present our work on acoustic word embeddings, including a variety of models in unsupervised, weakly supervised, and supervised settings.
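
To make the idea of a fixed-dimensional embedding concrete, here is a minimal sketch of one very simple way to obtain one: pick a fixed number of frames at evenly spaced positions in the segment and concatenate them. This is only an illustrative baseline, not one of the models presented in the talk. The function names (`embed_segment`, `cosine_distance`, `rank_by_query`), the parameter `k`, and the toy two-dimensional frames are all assumptions made for the example; real systems would use acoustic features such as MFCCs.

```python
import math


def embed_segment(frames, k=4):
    """Map a variable-length list of feature frames to a fixed-length vector.

    Illustrative baseline: take k frames at evenly spaced positions and
    concatenate them, so the output always has k * frame_dim entries.
    """
    if not frames:
        raise ValueError("segment must contain at least one frame")
    n = len(frames)
    if k > 1:
        # Evenly spaced indices over [0, n - 1]; frames repeat when n < k.
        idx = [round(i * (n - 1) / (k - 1)) for i in range(k)]
    else:
        idx = [n // 2]
    return [x for i in idx for x in frames[i]]


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 1.0
    return 1.0 - dot / (nu * nv)


def rank_by_query(query, candidates, k=4):
    """Query-by-example: sort candidate names by distance to the query."""
    q = embed_segment(query, k)
    scored = [(cosine_distance(q, embed_segment(seg, k)), name)
              for name, seg in candidates.items()]
    return [name for _, name in sorted(scored)]


# Toy segments of different lengths (hypothetical 2-dimensional frames).
query = [[1.0, 0.0]] * 5
candidates = {
    "seg_a": [[1.0, 0.1]] * 9,  # close to the query
    "seg_b": [[0.0, 1.0]] * 3,  # far from the query
}
print(rank_by_query(query, candidates))
```

Because every segment maps to a vector of the same length regardless of its duration, comparing two segments reduces to a single vector distance, which is what makes embeddings attractive for query-by-example search.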