Acoustic word embeddings and neural segmental models
This talk covers two closely related lines of work.
For a number of speech tasks, it can be useful to represent speech segments of arbitrary length by fixed-dimensional vectors, or embeddings. In particular, vectors representing word segments -- acoustic word embeddings -- can be used in query-by-example tasks, example-based speech recognition, or spoken term discovery. *Textual* word embeddings have been common in natural language processing for a number of years now; the acoustic analogue is only recently starting to be explored. This talk will present our work on acoustic word embeddings, including a variety of models in unsupervised, weakly supervised, and supervised settings.