Real-Time Multi-Source Detection
Introduction
This page presents the research project pursued at IRCAM on the problem of real-time multi-source detection. The goal of this project is to design a system that analyzes an audio stream and detects overlapping sources that may be corrupted by background noise. The devised system is required to be:
- Capable of rapidly processing the incoming audio stream as it unfolds in time.
- Able to detect the presence of several potentially overlapping sound sources.
- Robust against background noise in realistic environmental scenarios.
Proposed System
The system relies on non-negative matrix factorization (NMF) techniques. More precisely, real-time multi-source detection is addressed with non-negative decomposition, a modified NMF scheme in which the incoming signal is projected onto a basis of templates learned off-line before the decomposition. An important drawback of existing approaches in this context is the lack of control over the decomposition. We have developed algorithms that address this issue by controlling either the sparsity of the decomposition or its trade-off between the different frequency components.
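As a rough illustration of non-negative decomposition with a sparsity control, the sketch below keeps a template matrix fixed and estimates only the activations of one incoming frame. It is not the project's algorithm: it assumes a Euclidean cost with an L1 penalty and the standard multiplicative update for that cost, and the names `decompose_frame`, `W`, `v`, and `sparsity`, as well as the random toy templates, are hypothetical.

```python
import numpy as np

def decompose_frame(v, W, sparsity=0.0, n_iter=200, eps=1e-12):
    """Estimate non-negative activations h such that v ~ W @ h, with W fixed.

    Assumption: minimizes 0.5 * ||v - W h||^2 + sparsity * sum(h) using the
    standard multiplicative update, which keeps h non-negative.
    """
    h = np.ones(W.shape[1])          # positive initialization
    WtW = W.T @ W                    # precomputed once, since W never changes
    Wtv = W.T @ v
    for _ in range(n_iter):
        h *= Wtv / (WtW @ h + sparsity + eps)
    return h

# Toy data (hypothetical): 3 templates of 64 frequency bins each.
rng = np.random.default_rng(0)
W = rng.random((64, 3)) + 0.01       # columns play the role of learned templates
h_true = np.array([1.0, 0.0, 0.5])   # sources 0 and 2 are active, source 1 is silent
v = W @ h_true                       # one incoming spectral frame

h = decompose_frame(v, W)                       # no sparsity penalty
h_sparse = decompose_frame(v, W, sparsity=5.0)  # penalized activations

print("activations:       ", np.round(h, 3))
print("sparse activations:", np.round(h_sparse, 3))
```

In a streaming setting, a function of this kind would be called once per incoming frame, with `W` learned off-line beforehand. Raising `sparsity` shrinks the activations, which is one simple way to expose a control on the decomposition.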
Examples
Polyphonic music transcription
See Further Readings and our page on real-time transcription.
Drum transcription
Here are the two drum loops that accompany our last paper:
Environmental sound detection
Here are the three environmental scenes that accompany our last paper:
Contributors
Further Readings
- Arnaud Dessein, Arshia Cont, and Guillaume Lemaitre. Real-time detection of overlapping sound events with non-negative matrix factorization. In Frank Nielsen and Rajendra Bhatia, editors, Matrix Information Geometry, chapter 14, pages 341–371. Springer, Berlin/Heidelberg, Germany, 2013. (draft) (bibtex)
- Arnaud Dessein, Arshia Cont, and Guillaume Lemaitre. Real-time polyphonic music transcription with non-negative matrix factorization and beta-divergence. In 11th International Society for Music Information Retrieval Conference (ISMIR), pages 489–494, Utrecht, Netherlands, August 2010. (paper) (bibtex) (poster)
- Arnaud Dessein, Arshia Cont, and Guillaume Lemaitre. Real-time polyphonic music transcription with non-negative matrix factorization and beta-divergence. In 6th Music Information Retrieval Evaluation eXchange (MIREX), Utrecht, Netherlands, August 2010. (abstract) (bibtex) (web)
- Arnaud Dessein. Incremental multi-source recognition with non-negative matrix factorization. Master's thesis, Université Pierre et Marie Curie, Paris, France, June 2009. (report) (bibtex) (slides) (web)