Textbooks edited and authored by the members of our research group.

Computational Analysis of Sound Scenes and Events

This book presents computational methods for extracting useful information from audio signals, collecting the state of the art in the field of sound event and scene analysis. The authors cover the entire procedure for developing such methods, ranging from data acquisition and labeling, through the design of taxonomies used in the systems, to signal processing methods for feature extraction and machine learning methods for sound recognition. The book also covers advanced techniques for dealing with environmental variation and multiple overlapping sound sources, and for taking advantage of multiple microphones or other modalities. The book gives examples of usage scenarios in large media databases, acoustic monitoring, bioacoustics, and context-aware devices. Graphical illustrations of sound signals and their spectrographic representations are presented, as well as block diagrams and pseudocode of algorithms.

Editors: Tuomas Virtanen, Mark D. Plumbley, and Dan Ellis

Chapter authors: Annamaria Mesaros, Toni Heittola, Emre Cakir and Tuomas Virtanen

Techniques for Noise Robustness in Automatic Speech Recognition

Automatic speech recognition (ASR) systems are finding increasing use in everyday life. Many of the commonplace environments where the systems are used are noisy, for example when users call a voice search system from a busy cafeteria or a street. This can result in degraded speech recordings and adversely affect the performance of speech recognition systems. As the use of ASR systems increases, knowledge of the state of the art in techniques to deal with such problems becomes critical to system and application engineers and researchers who work with or on ASR technologies. This book presents a comprehensive survey of the state of the art in techniques used to improve the robustness of speech recognition systems to these degrading external influences.

Editors: Tuomas Virtanen, Rita Singh, and Bhiksha Raj

Chapter authors: Tuomas Virtanen

Signal Processing Methods for Music Transcription

Signal Processing Methods for Music Transcription is the first book dedicated to uniting research related to signal processing algorithms and models for various aspects of music transcription such as pitch analysis, rhythm analysis, percussion transcription, source separation, instrument recognition, and music structure analysis. Following a clearly structured pattern, each chapter provides a comprehensive review of the existing methods for a certain subtopic while covering the most important state-of-the-art methods in detail. The concrete algorithms and formulas are clearly defined and can be easily implemented and tested. A number of approaches are covered, including statistical methods, perceptually motivated methods, and unsupervised learning methods. The text is enhanced by a common reference and index.

This book aims to serve as an ideal starting point for newcomers and an excellent reference source for people already working in the field. Researchers and graduate students in signal processing, computer science, acoustics and music will primarily benefit from this text. It could be used as a textbook for advanced courses in music signal processing. Since it only requires a basic knowledge of signal processing, it is accessible to undergraduate students.

Editors: Anssi Klapuri and Manuel Davy

Chapter authors: Jouni Paulus, Tuomas Virtanen, Matti Ryynänen and Anssi Klapuri