Corpus-Based Sound Synthesis Survey
This page offers a comparative survey and taxonomy of the many approaches to corpus-based concatenative sound synthesis throughout the history of electronic music, from the 1950s, when they were not yet known by that name, up to the recent surge of contemporary methods.
It is the continuously updated online version of this article published in the JNMR special issue on Audio Mosaicing, guest editor Adam T. Lindsay.
A summary until 2005 can be found in this figure. It shows a comparison of musical sound synthesis methods according to selection (y-axis) and analysis (x-axis), use of concatenation quality (bold), and real-time capabilities (italics).
Recent entries are listed here by year:
- ScrambledHacks by Sven König --- spectral lookup together with ganged music-video concatenation!
- New spectral explorations by Jean-François Charles --- spectral frame mosaicing in Jitter
- Concatenative Synthesis Using Score-Aligned Transcriptions. ICMC 2006, Roger Dannenberg's take on chroma (i.e. pitch-class) based lookup of notes or chords from classical music
- Using Concatenative Synthesis for Expressive Performance in Jazz Saxophone. ICMC 2006, Maestre, E., Hazan, A., Ramirez, R., and Perez, A.
- Meapsoft --- java software for descriptor-based concatenative reordering
- Corpus-Based Concatenative Synthesis: Assembling sounds by content-based selection of units from large sound databases. IEEE Signal Processing Magazine Special Section: Signal Processing for Sound Synthesis, Diemo Schwarz. --- New reference article about CBCS.
Presented at NIME 2007, New York:
- Chroma Palette: Chromatic Maps of Sound As Granular Synthesis Interface (Justin Donaldson, Ian Knopke, Chris Raphael) --- Uses a 2D interface for (usably real-time?) interactive browsing in a sound space based on chroma (pitch classes). The 20 descriptor dimensions (12 pitch classes + 8 octave classes) are reduced to 2 dimensions by a multi-dimensional scaling approach.
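The dimensionality reduction described in this entry can be illustrated with a small sketch. It uses classical (metric) multi-dimensional scaling, which is only one of several MDS variants and not necessarily the one the authors used. The function name `classical_mds_2d` and the random 20-dimensional descriptor vectors are assumptions made for illustration.

```python
import math
import random


def classical_mds_2d(points, iterations=200, seed=0):
    """Project equal-length descriptor vectors to 2-D with classical MDS.

    Hypothetical helper for illustration; not taken from the Chroma Palette paper.
    """
    n = len(points)
    # Squared Euclidean distances between all pairs of units.
    d2 = [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]
    # Double centering turns squared distances into a Gram matrix B.
    row_mean = [sum(row) / n for row in d2]
    total_mean = sum(row_mean) / n
    b = [[-0.5 * (d2[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    axes = []
    for _ in range(2):
        # Power iteration finds the dominant eigenvector of B.
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        eigenvalue = 0.0
        for _ in range(iterations):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
            eigenvalue = norm
        # Coordinates along this axis: eigenvector scaled by sqrt(eigenvalue).
        axes.append([math.sqrt(max(eigenvalue, 0.0)) * x for x in v])
        # Deflation removes the found component before the next axis.
        b = [[b[i][j] - eigenvalue * v[i] * v[j] for j in range(n)] for i in range(n)]
    return [(axes[0][i], axes[1][i]) for i in range(n)]


if __name__ == "__main__":
    # Made-up corpus: 6 units with 12 pitch-class + 8 octave-class values each.
    rng = random.Random(1)
    units = [[rng.random() for _ in range(20)] for _ in range(6)]
    for index, (x, y) in enumerate(classical_mds_2d(units)):
        print(index, round(x, 3), round(y, 3))
```

The resulting 2D coordinates preserve pairwise descriptor distances as well as two dimensions allow, which is what makes them usable as positions in a browsing interface.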
This is the harvest of ICMC 2007, Copenhagen:
- MUSICAL APPLICATIONS OF REAL-TIME CORPUS-BASED CONCATENATIVE SYNTHESIS (Diemo Schwarz, Ircam-CNRS STMS, France; Sam Britton, freelance, United Kingdom; Roland Cahen, ENSCI, France; Thomas Goepfer, Ircam-Centre Pompidou, France) --- paper about recent pieces and performances using CataRT
- MUSED: NAVIGATING THE PERSONAL SAMPLE LIBRARY (Graham Coleman, United States) --- A system resembling CataRT for sample-based music composition. It analyses an input song, segments it into notes and characterises each segment by pitch, loudness, brilliance, etc. The segments are then laid out in 2D space and can be chosen and played by clicking on them. An innovative reordering feature of the display places the segments that follow the clicked-on segment in a row close to it.
- BIOTOOLS: INTRODUCING A HARDWARE AND SOFTWARE TOOLKIT FOR FAST IMPLEMENTATION OF BIOSIGNALS FOR MUSICAL APPLICATIONS (Miguel Angel Ortiz Pérez, Benjamin Knapp, Sonic Arts Research Centre, United Kingdom) --- Uses CataRT for performances controlled by a muscular-tension biosensor. See this video of a performance.
- AUDIOVISUAL CONCATENATIVE SYNTHESIS (Nick Collins, Sussex University, United Kingdom) --- uses 5 audio + 5 video descriptors, audio or video-driven selection of audio+video segment from a database. Matching of a whole sequence of audio/video by « shingling » (cf. Casey).
- SOUNDSPOTTER / REMIX-TV: FAST APPROXIMATE MATCHING FOR AUDIO AND VIDEO PERFORMANCE (Michael Casey, Mick Grierson, Goldsmiths, University of London, United Kingdom) --- Audio-driven audio+video lookup and concatenation using Casey’s SoundSpotter. Matching of a whole sequence of audio/video by « shingling ». References Michel Chion’s work about film-sound. (cf. Collins)
- GUIDAGE: A FAST AUDIO QUERY GUIDED ASSEMBLAGE (Arshia Cont, UCSD / Ircam, France; Dubnov Shlomo, University of California in San Diego, United States; Gérard Assayag, Ircam, France) --- Factor-Oracle based implicit-segmentation audio-driven recomposition
- DataJockey (Alex Norman, Xavier Amatriain): a tool to help a digital DJ select tracks based on meta-data and sound descriptors. --- This is corpus-based synthesis on a larger time-scale, that of a DJ set.
- Principles and Applications of Interactive Corpus-Based Concatenative Synthesis. JIM 2008, Diemo Schwarz, Roland Cahen, Sam Britton. --- New reference article about CataRT and its musical applications.
- Extending voice-driven synthesis to audio mosaicing, Jordi Janer, Maarten de Boer, Music Technology Group, Universitat Pompeu Fabra, Barcelona, SMC 2009. Abstract: This paper presents a system for controlling audio mosaicing with a voice signal, which can be interpreted as a further step in voice-driven sound synthesis. Compared to voice-driven instrumental synthesis, it increases the variety in the synthesized timbre. Also, it provides a more direct interface for audio mosaicing applications, where the performer voice controls rhythmic, tonal and timbre properties of the output sound. In a first step, voice signal is segmented into syllables, extracting a set of acoustic features for each segment. In the concatenative synthesis process, the voice acoustic features (target) are used to retrieve the most similar segment from the corpus of audio sources. We implemented a system working in pseudo-realtime, which analyzes voice input and sends control messages to the concatenative synthesis module. Additionally, this work raises questions to be further explored about mapping the input voice timbre space onto the audio sources timbre space.
- Beyond Concatenation: Some Ideas for the Creative Use of Corpus-based Sonic Material, ICMC 2009, Thomas Stoll. 
- Corpus-Based Transcription as an Approach to the Compositional Control of Timbre, ICMC 2009, Aaron Einbond, Diemo Schwarz, Jean Bresson. --- Using CataRT and related tools in the FTM and Gabor libraries for Max/MSP we describe a technique for real-time analysis of a live signal to pilot corpus-based synthesis, along with examples of compositional realizations in works for instruments, electronics, and sound installation. To extend this technique to computer-assisted composition for acoustic instruments, we develop tools using the Sound Description Interchange Format (SDIF) to export sonic descriptors to OpenMusic where they may be further manipulated and transcribed into an instrumental score. This presents a flexible technique for the compositional organization of noise-based instrumental sounds.
- Sound Object Classification for Symbolic Audio Mosaicing: A Proof-of-Concept, SMC 2009, Jordi Janer, Martin Haro, Gerard Roma, Takuya Fujishima, Naoaki Kojima.
- BotTalk by Ben Smith "generates mp3 mashups of a last.fm user's top artists. The mashup is created from mp3s from each artist. These are sorted according to pitch and each track has a drum track created by taking the bars from the next track that match the key of the current track. Each track is then mixed with it's beat track, with the beat track looped. The resulting mix tracks are then ordered by loudness and about 30 seconds of each are bolted together to create the final mix-tape." --- corpus-based concatenative mashup synthesis (noticed by HarS)
- Auditory Spectral Summarisation for Audio Signals with Musical Applications, ISMIR 2009, Sam Ferguson and Densil Cabrera, uses the temporal reduction capability of concatenative synthesis for auditory display and exploration of audio spectra.
- TimbreID is a collection of PD externals that allow descriptor analysis, corpus-based concatenative synthesis and CataRT-style 2D interaction. (found by Bruno Ruviaro)
- Expressive Concatenative Synthesis by Reusing Samples from Real Performance Recordings. CMJ 2009, Esteban Maestre, Rafael Ramirez, Stefan Kersten, Xavier Serra. --- culmination of the research work on expression modeling and concatenative synthesis since 2004.
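Several entries in this list, such as the voice-driven mosaicing system by Janer and de Boer, rely on the same core step: the descriptors of a target segment are used to retrieve the most similar unit from the corpus. A minimal sketch of that step follows, assuming each unit is already described by a short feature vector. The function names, the normalisation by corpus standard deviation, and the example values are illustrative assumptions, not taken from any of the cited systems.

```python
import math


def select_unit(target, corpus, weights=None):
    """Return the index of the corpus unit closest to the target descriptors.

    Hypothetical helper: weighted squared Euclidean distance on descriptors
    normalised by their standard deviation over the corpus.
    """
    dims = len(target)
    if weights is None:
        weights = [1.0] * dims
    count = len(corpus)
    means = [sum(unit[d] for unit in corpus) / count for d in range(dims)]
    # Fall back to 1.0 for constant descriptors to avoid division by zero.
    stds = [math.sqrt(sum((unit[d] - means[d]) ** 2 for unit in corpus) / count) or 1.0
            for d in range(dims)]

    def distance(unit):
        return sum(w * ((t - x) / s) ** 2
                   for w, t, x, s in zip(weights, target, unit, stds))

    return min(range(count), key=lambda i: distance(corpus[i]))


def mosaic(targets, corpus, weights=None):
    """Select one corpus unit per target segment, in order."""
    return [select_unit(target, corpus, weights) for target in targets]


if __name__ == "__main__":
    # Made-up descriptors: pitch (Hz), loudness (dB), spectral centroid (Hz).
    corpus = [[220.0, -20.0, 1500.0], [440.0, -12.0, 3000.0], [880.0, -6.0, 5000.0]]
    targets = [[230.0, -19.0, 1600.0], [900.0, -5.0, 5100.0]]
    print(mosaic(targets, corpus))
```

This sketch selects each unit independently. Systems that also consider concatenation quality would add a cost for the transition between consecutive units, which is left out here.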
This fruitful year saw many applications of CBCS and the first concatenative iPhone App!
- Concat by researcher-musician Nick Collins at Sussex University (not musician-researcher Nicolas Collins) is an application for iPhone that does live audio controlled musaicing on a corpus of self-recorded sounds. The matching is based on four timbral features, and you can choose the size of the matched segment played back. As Nick puts it: "Found myself doing various bits of iphone hacking in the last year, latest app is a realtime concatenative synthesizer (strangely called Concat)."
- Surfing the Waves: Live Audio Mosaicing of an Electric Bass Performance as a Corpus Browsing Interface. NIME 2010, Pierre Alexandre Tremblay, Diemo Schwarz.  --- Controlling CataRT live by an electric bass.
- Mechanisms for Controlling Complex Sound Sources: Applications to Guitar Feedback Control. NIME 2010, Aengus Martin, Sam Ferguson, and Kirsty Beilharz.  --- Using CBCS for inversion of a complex synthesis model, here a feedbacking robotic guitar.
- Grainstick: A Collaborative, Interactive Sound Installation. ICMC 2010, Grace Leslie et al.  --- Uses a tiny bit of CataRT for a sound installation composed by Pierre Jodlowski.
- Spatializing Timbre with Corpus-Based Concatenative Synthesis. ICMC 2010, Aaron Einbond, Diemo Schwarz. --- Using CataRT and the FTM library for Max/MSP we develop a technique for the organization of a navigation space for synthesis based on user-defined spatial zones and informed by the perceptual concept of timbre space. The goal is to place the listener in the midst of a virtual space of sounds organized by their descriptor data, simulating an immersion in timbre space.
- A Modular Sound Descriptor Analysis Framework for Relaxed-real-time Applications. ICMC 2010, Diemo Schwarz, Norbert Schnell. --- Describes the future CataRT import architecture.
- A Timbre Analysis and Classification Toolkit for Pure Data. ICMC 2010, William Brent. --- Publication of the TimbreID PD externals mentioned under 2009.
- A Database System for Organizing Musique Concrete. ICMC 2010, Christopher Bailey. --- CBCS in FileMaker on hand-assigned descriptors!
- Abstraction in a Unit-based Audio Processing System. ICMC 2010, Tom Stoll. --- Requirements for Tom's version of CataRT
- (Ab)using MIR to create music: corpus-based synthesis and audio mosaicing SMC 2010 tutorial, Diemo Schwarz.
- Descriptor-based Sound Texture Sampling. SMC 2010, Diemo Schwarz and Norbert Schnell. --- Application of CataRT to environmental sound texture synthesis.
- Timbre Remapping through a Regression-Tree Technique. SMC 2010, Dan Stowell, MD Plumbley. --- Solves the problem of mapping one (controlling) corpus to a source corpus raised by Tremblay and Schwarz
- Augmenting Sound Mosaicing with Descriptor-Driven Transformation. DAFx 2010, Graham Coleman, Esteban Maestre, Jordi Bonada. --- This scheme has also been implemented as one of the hacks of the Barcelona Music Hack Day: Remix Sound Harmonizer/MakeItRick
- Nuvolet: 3D gesture-driven collaborative audio mosaicing, Josep M Comajuncosas, Enric Guaus, Alex Barrachina, and John O'Connell. --- a 3D positional interface to CataRT using the Kinect for collaborative (well, not really) exploration of sound clouds and interactive target-based audio mosaicing, for a piece by composer Ariadna Alsina. The article adds an estimation of the defining features of the new interface in comparison to other free-space interfaces. As the authors correctly say: "the challenges inherent in the design of open-air interfaces pose some unavoidable issues which deserve further research."
- Prioritizing Audio Features Selection Using Analysis Hierarchy Process As A Mean To Extend User Control in Concatenative Sound Synthesis. ICMC 2011, Noris Mohd Norowi, Eduardo Reck Miranda. --- Introduces the AHP (Analysis Hierarchy Process, from decision sciences) method for turning user priority judgments into selection weights for CBCS, but otherwise misses its point, since weighting does not actually prioritise the selection. See our SMC 2011 article for an elegant way to integrate actual step-by-step priorities with the geometric unit search.
- CBPSC: Corpus-Based Processing for SuperCollider. ICMC 2011, Thomas Stoll. --- Implementation of CataRT-based CBP (see ICMC 2009 above) in SuperCollider.
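To make the AHP entry above more concrete, the following sketch shows one common way of turning pairwise priority judgments into descriptor weights: the row geometric mean approximation. It is an assumption that this matches the procedure of the cited paper, which may instead use the principal eigenvector of the comparison matrix. The function name and the example judgments are hypothetical.

```python
import math


def ahp_weights(pairwise):
    """Derive normalised weights from a reciprocal pairwise comparison matrix.

    pairwise[i][j] states how many times more important descriptor i is
    than descriptor j, so pairwise[j][i] should be 1 / pairwise[i][j].
    Uses the row geometric mean approximation (illustrative choice).
    """
    n = len(pairwise)
    geometric_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]


if __name__ == "__main__":
    # Made-up judgments for three descriptors: pitch, loudness, centroid.
    judgments = [
        [1.0, 2.0, 6.0],
        [1.0 / 2.0, 1.0, 3.0],
        [1.0 / 6.0, 1.0 / 3.0, 1.0],
    ]
    print(ahp_weights(judgments))
```

Weights obtained this way only scale each descriptor's contribution to a distance measure. As the entry notes, that is different from a strict step-by-step prioritisation of the selection.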
Do you know more?
If you know of any software, research project, or musical approach that is somehow based on a database of sounds, please let me know by mail to schwarz (at) ircam (dot) fr
- Schwarz, D. (2006). Concatenative Sound Synthesis: The Early Years. Journal of New Music Research, 35(1), 3.
- Maestre, E., Hazan, A., Ramirez, R., & Perez, A. (2006). Using Concatenative Synthesis for Expressive Performance in Jazz Saxophone. Paper presented at International Computer Music Conference (ICMC).
- Schwarz, D. (2007). Corpus-Based Concatenative Synthesis. IEEE Signal Processing Magazine, 24(2), 92.
- Schwarz, D., Britton, S., Cahen, R., & Goepfer, T. (2007). Musical Applications of Real-Time Corpus-Based Concatenative Synthesis. Paper presented at International Computer Music Conference (ICMC), Copenhagen, Denmark.
- Schwarz, D., Cahen, R., & Britton, S. (2008). Principles and Applications of Interactive Corpus-Based Concatenative Synthesis. Paper presented at Journées d'Informatique Musicale (JIM), GMEA, Albi, France.
- Stoll, T. (2009). Beyond Concatenation: Some Ideas for the Creative Use of Corpus-based Sonic Material. Paper presented at International Computer Music Conference (ICMC), Montreal, QC.
- Einbond, A., Schwarz, D., & Bresson, J. (2009). Corpus-Based Transcription as an Approach to the Compositional Control of Timbre. Paper presented at International Computer Music Conference (ICMC), Montreal, QC, Canada.
- Maestre, E., Ramírez, R., Kersten, S., & Serra, X. (2009). Expressive Concatenative Synthesis by Reusing Samples from Real Performance Recordings. Computer Music Journal, 33(4), 23.
- Tremblay, P. A., & Schwarz, D. (2010). Surfing the Waves: Live Audio Mosaicing of an Electric Bass Performance as a Corpus Browsing Interface. Paper presented at Conference for New Interfaces for Musical Expression, Sydney, Australia.
- Martin, A., Ferguson, S., & Beilharz, K. (2010). Mechanisms for Controlling Complex Sound Sources: Applications to Guitar Feedback Control. Paper presented at Conference for New Interfaces for Musical Expression, Sydney, Australia.
- Leslie, G., Schwarz, D., Warusfel, O., Bevilacqua, F., Zamborlin, B., Jodlowski, P., & Schnell, N. (2010). Grainstick: A Collaborative, Interactive Sound Installation. Paper presented at International Computer Music Conference (ICMC), New York, NY.
- Einbond, A., & Schwarz, D. (2010). Spatializing Timbre with Corpus-Based Concatenative Synthesis. Paper presented at International Computer Music Conference (ICMC), New York, NY.
- Schwarz, D., & Schnell, N. (2010). A Modular Sound Descriptor Analysis Framework for Relaxed-real-time Applications. Paper presented at International Computer Music Conference (ICMC), New York, NY.
- Schwarz, D., & Schnell, N. (2010). Descriptor-based Sound Texture Sampling. Paper presented at International Conference on Sound and Music Computing (SMC), Barcelona, Spain.
- Stowell, D., & Plumbley, M. (2010). Timbre Remapping through a Regression-Tree Technique.