Real-Time Multi-Source Detection

Introduction

This page presents the research project pursued at IRCAM on the problem of real-time multi-source detection. The goal of this project is to design a system that analyzes an audio stream and detects overlapping sources that may be corrupted by background noise. The devised system is required to be:

  • Capable of rapidly processing the incoming audio stream as it unfolds in time.
  • Able to detect the presence of several potentially overlapping sound sources.
  • Robust against background noise in realistic environmental scenarios.

Proposed System

The system relies on non-negative matrix factorization (NMF) techniques. More precisely, real-time multi-source detection is addressed with non-negative decomposition, a modified NMF scheme in which the incoming signal is projected onto a basis of templates learned off-line prior to the decomposition. An important drawback of existing approaches in this context is the lack of control over the decomposition. We have developed algorithms that address this issue by controlling either the sparsity of the decomposition or the trade-off it makes between the different frequency components of the signal.
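
To make the idea of non-negative decomposition concrete, here is a minimal sketch in plain Python. It is not the algorithm from the papers listed under Further Readings: it assumes a Euclidean cost with a simple L1 penalty on the activations and uses the standard multiplicative update, with the templates kept fixed. The function name `decompose_frame`, the `sparsity` parameter, and the toy templates are all hypothetical and chosen for illustration only.

```python
def decompose_frame(v, templates, sparsity=0.0, n_iter=200, eps=1e-12):
    """Approximate v by sum_k h[k] * templates[k] with h[k] >= 0.

    Assumption (not from the source page): minimizes
    0.5 * ||v - W h||^2 + sparsity * sum(h) with multiplicative updates,
    where the columns of W are the fixed templates learned off-line.
    """
    n = len(templates)
    # Precompute W^T v and W^T W once, since the templates never change.
    wtv = [sum(w_f * v_f for w_f, v_f in zip(w, v)) for w in templates]
    wtw = [[sum(a * b for a, b in zip(wi, wj)) for wj in templates]
           for wi in templates]
    h = [1.0] * n  # strictly positive start keeps every update well defined
    for _ in range(n_iter):
        # All activations are updated simultaneously from the previous h.
        h = [
            h[k] * wtv[k]
            / (sum(wtw[k][j] * h[j] for j in range(n)) + sparsity + eps)
            for k in range(n)
        ]
    return h


# Toy example: two overlapping spectral templates over three frequency bins.
templates = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
# Incoming frame built as 2 * first template + 1 * second template.
frame = [2.0, 3.0, 1.0]

activations = decompose_frame(frame, templates)
sparse_activations = decompose_frame(frame, templates, sparsity=1.0)
```

In a streaming setting, a function like this would be called once per incoming frame, and a source could be reported as present when its activation exceeds some threshold (the thresholding rule is again an assumption here). Raising `sparsity` shrinks the activations, which is one simple way to expose a control over the decomposition.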

Examples

Polyphonic music transcription

See Further Readings and our page on real-time transcription.

Drum transcription

Here are the two drum loops that accompany our last paper:

Environmental sound detection

Here are the three environmental scenes that accompany our last paper:

Contributors

Further Readings

  • Arnaud Dessein, Arshia Cont, and Guillaume Lemaitre. Real-time detection of overlapping sound events with non-negative matrix factorization. In Frank Nielsen and Rajendra Bhatia, editors, Matrix Information Geometry, chapter 14, pages 341–371. Springer, Berlin/Heidelberg, Germany, 2013. (draft) (bibtex)
  • Arnaud Dessein, Arshia Cont, and Guillaume Lemaitre. Real-time polyphonic music transcription with non-negative matrix factorization and beta-divergence. In 11th International Society for Music Information Retrieval Conference (ISMIR), pages 489–494, Utrecht, Netherlands, August 2010. (paper) (bibtex) (poster)
  • Arnaud Dessein, Arshia Cont, and Guillaume Lemaitre. Real-time polyphonic music transcription with non-negative matrix factorization and beta-divergence. In 6th Music Information Retrieval Evaluation eXchange (MIREX), Utrecht, Netherlands, August 2010. (abstract) (bibtex) (web)
  • Arnaud Dessein. Incremental multi-source recognition with non-negative matrix factorization. Master's thesis, Université Pierre et Marie Curie, Paris, France, June 2009. (report) (bibtex) (slides) (web)
  • Arshia Cont, Shlomo Dubnov, and David Wessel. Realtime multiple-pitch and multiple-instrument recognition for music signals using sparse non-negative constraints. In 10th International Conference on Digital Audio Effects (DAFx), Bordeaux, France, September 2007. (pdf) (bibtex)
  • Arshia Cont. Realtime multiple pitch observation using sparse non-negative constraints. In 7th International Symposium on Music Information Retrieval (ISMIR), Victoria, Canada, October 2006. (pdf) (bibtex)