Corpus-Based Sound Synthesis Survey


This page offers an ongoing bibliographic survey of research work and approaches to corpus-based concatenative sound synthesis (CBCS).

It is the continuously updated online version of the article published in the JNMR special issue on Audio Mosaicing (guest editor Adam T. Lindsay) [1]. It tracks the many different approaches to CBCS throughout the history of electronic music, starting in the 1950s, even if they were not known as such at the time, up to the recent surge of contemporary automatic methods.

If you know of any software, research project, or musical approach that is somehow based on a database of sounds, please let me know by mail to schwarz (at) ircam (dot) fr. Thanks to the current contributors: Giuseppe Torre, Abram Hindle, Nick Collins.

Here's the bibliography by year:


  • A Joyful Ode to Automatic Orchestration, François Pachet. --- SONY CSL's flow machines project to reorchestrate Beethoven's Freude schöner Götterfunken, in seven styles, one of them synthesised by concatenative synthesis of Bossa-Nova recordings.
  • Coleman, Graham. "Descriptor Control of Sound Transformations and Mosaicing Synthesis.", PhD thesis, UPF, Barcelona, 2016.
  • Bilous, James Eric. "Concatenative Synthesis for Novel Timbral Creation." (2016).
  • Ghisi, Daniele, and Carlos Agon. Real-Time Corpus-Based Concatenative Synthesis for Symbolic Notation, TENOR 2016.

ICMC 2016

  • Evaluation of a Sketching Interface to Control a Concatenative Synthesiser. Augoustinos Tsiros, Grégory Leplâtre. --- continuation of the beautiful work on transforming graphical texture, shape, and colour to sound via CBCS for sound design.
  • Introducing CatOracle: Corpus-based concatenative improvisation with the Audio Oracle algorithm, Aaron Einbond, Diemo Schwarz, Riccardo Borghesi, Norbert Schnell.
  • Concatenative Synthesis via Chord-Based Segmentation For "An Experiment with Time", Daniele Ghisi, Mattia G. Bergomi.


  • Lindborg, PerMagnus, and Joyce Beetuan Koh. "About When We Collide: A Generative and Collaborative Sound Installation." Sound and Interactivity 20 (2015): 104.


  • Composing Music by Selection: Content-Based Algorithmic-Assisted Audio Composition. Bernardes, G. PhD dissertation on earGram, University of Porto.
  • Garcia, Jérémie. "Supporting music composition with interactive paper." PhD diss., Université Paris Sud-Paris XI, 2014. --- beautiful use-case of recording 2D descriptor trajectories during composition with CataRT on interactive paper to be conserved and annotated.

NIME 2014

  • Davies, Matthew EP, Adam M. Stark, Fabien Gouyon, and Masataka Goto. "Improvasher: A Real-Time Mashup System for Live Musical Input." In NIME, pp. 541-544. 2014.
  • Gossmann, Joachim, and Max Neupert. "Musical Interface to Audiovisual Corpora of Arbitrary Instruments." In NIME, pp. 151-154. 2014.


  • Query-by-multiple-examples: content-based search in computer-assisted sound-based musical composition. Tiago Fernandes Tavares, Jônatas Manzolli.
  • Fine-tuned control of concatenative synthesis with CataRT using the Bach Library for Max. Aaron Einbond, Christopher Trapani, Andrea Agostini, Daniele Ghisi, Diemo Schwarz.
  • Considering Roughness to Describe and Generate Vertical Musical Structure in Content-Based Algorithmic-Assisted Audio Composition. Gilberto Bernardes, Matthew E. P. Davies, Carlos Guedes, Bruce Pennycook. --- earGram's high-level control using loudness and roughness


  • Composing Morphology: Concatenative Synthesis as an Intuitive Medium for Prescribing Sound in Time. Benjamin Hackbarth, Norbert Schnell, Philippe Esling, Diemo Schwarz. Contemporary Music Review. April 2013, vol. 32, n° 1, p. 49-59.
  • Dirty Tangible Interfaces: Expressive Control of Computers with True Grit. Matthieu Savary, Diemo Schwarz, Denis Pellerin, Florence Massin, Christian Jacquemin, Roland Cahen. CHI 2013. Paris, p. 2991-2994.
  • Keynote address Interacting with a Corpus of Sounds. Diemo Schwarz. SI13: NTU/ADM Symposium on Sound and Interactivity. Singapore.
  • Costello, Edward, Victor Lazzarini, and Joseph Timoney. "A streaming audio mosaicing vocoder implementation." (2013): 194-201. --- audio mosaicing per spectral band(!)

CMMR 2013

  • Constellation: A Tool for Creative Dialog Between Audience and Composer. Akito van Troyer. --- "To our knowledge, Constellation is the first web browser-based application that used real-time concatenative synthesis as a running synthesis engine for an asynchronous collaborative and creative music composition."
  • Retexture — Towards Interactive Environmental Sound Texture Synthesis through Inversion of Annotations. Diemo Schwarz
  • Design of a Customizable Timbre Space Synthesizer. Daniel Gómez, Rafael Vega and Carlos Arce-Lopera.


  • A. Refsum Jensenius and V. Johnson, "Performing the Electric Violin in a Sonic Space," in Computer Music Journal, vol. 36, no. 4, pp. 28-39, Dec. 2012.
  • Zappi, Victor, Dario Mazzanti, Andrea Brogni, and Darwin Caldwell. "Concatenative Synthesis Unit Navigation and Dynamic Rearrangement in vrGrains." (2012). --- 3D corpus browsing installation
  • van Troyer, Akito. "Constellation: A Tool for Creative Dialog Between Audience and Composer."
  • Norowi, Noris Mohd. "An Artificial Intelligence Approach to Concatenative Sound Synthesis." PhD diss., University of Plymouth, 2013.
  • Murray-Rust, Dave, and Rocio von Jungenfeld. "Thawing colours: dangling from the fuzzy end of interfaces." In Fourth International Workshop on Physicality, p. 32. 2012.
  • Peláez, Pablo Molina. "DJ Support Agent Based on Audio Mosaicing: Enhancing DJ-Skills for People with or without them." PhD diss., Universitat Pompeu Fabra, 2010.

CMMR 2012

  • EarGram: an Application for Interactive Exploration of Large Databases of Audio Snippets for Creative Purposes. Bernardes, G., Guedes, C., and Pennycook, B.
  • EarGram: An application for interactive exploration of concatenative sound synthesis in Pure Data. Bernardes, G., Guedes, C., & Pennycook, B. --- Post-proceeding publication in LNCS: M. Aramaki, M. Barthet, R. Kronland-Martinet, & S. Ystad (Eds.), From sounds to music and emotions (pp. 110-129). Berlin-Heidelberg: Springer-Verlag.

NIME 2012

  • The Sound Space as Musical Instrument: Playing Corpus-Based Concatenative Synthesis. Diemo Schwarz.
  • DIRTI — Dirty Tangible Interfaces. Matthieu Savary, Diemo Schwarz, Denis Pellerin.
  • Castet, Julien. "Performing experimental music by physical simulation." In NIME. 2012.
  • Rébillat, Marc. "Vibrations de plaques multi-excitateurs de grandes dimensions pour la création d’environnements virtuels audio-visuels." PhD diss., École Polytechnique, 2012.

ICMC 2012

  • Browsing Music and Sound Using Gestures in a Self-Organized 3D Space. Gabrielle Odowichuk and George Tzanetakis. --- approach very similar to CataRT in 3D, but without giving a reference
  • Precise Pitch Control in Real Time Corpus-Based Concatenative Synthesis. Aaron Einbond, Christopher Trapani and Diemo Schwarz
  • Navigating Variation: Composing for Audio Mosaicing. Diemo Schwarz and Benjamin Hackbarth.


  • Distance Mapping for Corpus-Based Concatenative Synthesis. Diemo Schwarz. Sound and Music Computing (SMC), Padova.
  • Nuvolet : 3d gesture-driven collaborative audio mosaicing, Josep M Comajuncosas, Enric Guaus, Alex Barrachina, and John O'Connell. NIME 2011. --- a 3D positional interface to CataRT using the Kinect for collaborative (well, not really) exploration of sound clouds and interactive target-based audio mosaicing, for a piece by composer Ariadna Alsina. The article adds an estimation of the defining features of the new interface in comparison to other free-space interfaces. As the authors correctly say: "the challenges inherent in the design of open-air interfaces pose some unavoidable issues which deserve further research."
  • Corpus-Based Improvisation. Diemo Schwarz, Victoria Johnson. (Re)thinking Improvisation. Malmö.
  • La collection numérique comme modèle pour la synthèse sonore, la composition et l’exploration multimédia interactives. Alain Bonardi, Francis Rousseaux, Diemo Schwarz, Benjamin Roadley. Revue Musimédiane. Juin 2011, n° 6
  • Martin, Augusto Gabriel Vigliensoni. "Touchless gestural control of concatenative sound synthesis." PhD diss., McGill University, 2011.
  • Aperghis, Georges, and Grégory Beller. "Contrôle gestuel de la synthèse concaténative en temps réel dans Luna Park." Rapport de recherche et développement (2011).
  • Molina, Pablo, Martín Haro, and Sergi Jordá. "BeatJockey: A New Tool for Enhancing DJ Skills." In NIME, pp. 288-291. 2011.

ICMC 2011

  • Prioritizing Audio Features Selection Using Analysis Hierarchy Process As A Mean To Extend User Control in Concatenative Sound Synthesis. ICMC 2011, Noris Mohd Norowi, Eduardo Reck Miranda. --- Introduces the AHP (Analysis Hierarchy Process, from decision sciences) method, which inverts user priority judgments in order to derive the selection weights for CBCS, but otherwise misses its point, since that does not actually prioritise the selection. See our SMC 2011 article for an elegant way to integrate actual step-by-step priorities with the geometric unit search.
  • CBPSC: Corpus-Based Processing for Supercollider. ICMC 2011, Thomas Stoll. --- Implementation of CataRT-based CBP (see ICMC 2009 below) in Supercollider.
  • Yee-King, Matthew John. "An autonomous timbre matching improviser." In ICMC. 2011.
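For readers unfamiliar with AHP, the step from pairwise priority judgments to selection weights can be sketched as follows. This is the generic textbook procedure (normalised principal eigenvector of a reciprocal comparison matrix, found here by power iteration), not the implementation from the Norowi and Miranda paper. The descriptor names and the judgment values are invented for illustration, and `ahp_weights` is a hypothetical helper name.

```python
def ahp_weights(pairwise, iterations=100):
    """Approximate the principal eigenvector of a pairwise comparison
    matrix by power iteration, normalised so the weights sum to 1."""
    n = len(pairwise)
    w = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(pairwise[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]
    return w

# Hypothetical descriptors and judgments: entry [i][j] states how much
# more important descriptor i is than descriptor j (reciprocal matrix).
descriptors = ["pitch", "loudness", "brightness"]
judgments = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]

weights = ahp_weights(judgments)
for name, weight in zip(descriptors, weights):
    print(f"{name}: {weight:.3f}")
```

The resulting weights only scale each descriptor's contribution to a distance; as the annotation above notes, this is not the same as a strict step-by-step prioritisation of the selection.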


This fruitful year saw many applications of CBCS and the first concatenative iPhone App!

  • Concat by researcher-musician Nick Collins at Sussex University (not musician-researcher Nicolas Collins) is an application for iPhone that does live audio controlled musaicing on a corpus of self-recorded sounds. The matching is based on four timbral features, and you can choose the size of the matched segment played back. As Nick puts it: Found myself doing various bits of iphone hacking in the last year, latest app is a realtime concatenative synthesizer (strangely called Concat).
  • A video based analysis system for realtime control of concatenative sound synthesis and spatialisation. Jensenius, Alexander Refsum; Johnson, Victoria. Proceedings of the second Norwegian Artificial Intelligence Symposium : 2010, 85-88.
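The kind of matching that the Concat entry describes (a live input frame selecting the closest unit in a corpus, based on four timbral features) can be illustrated with a minimal sketch. Everything specific here is an assumption: the feature values are made up, plain Euclidean distance stands in for whatever measure the app uses, and `Unit` and `select_unit` are hypothetical names.

```python
import math
from dataclasses import dataclass

@dataclass
class Unit:
    name: str
    features: tuple  # four hypothetical timbral descriptor values

def distance(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_unit(corpus, target):
    """Return the corpus unit whose descriptors are closest to the target."""
    return min(corpus, key=lambda u: distance(u.features, target))

# Toy corpus of self-recorded sounds (values invented for illustration).
corpus = [
    Unit("kick", (0.1, 0.2, 0.9, 0.1)),
    Unit("hat", (0.9, 0.8, 0.1, 0.7)),
    Unit("voice", (0.5, 0.4, 0.5, 0.3)),
]

# Descriptor frames analysed from the live input, one per matched segment.
target_frames = [(0.15, 0.25, 0.8, 0.1), (0.85, 0.9, 0.2, 0.6)]
sequence = [select_unit(corpus, t).name for t in target_frames]
print(sequence)
```

A linear scan like this is fine for a small corpus; larger corpora would typically call for a spatial index such as a kD-tree.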

NIME 2010

ICMC 2010

  • Spatializing Timbre with Corpus-Based Concatenative Synthesis. Einbond Aaron, Schwarz Diemo, ICMC 2010 [12] --- Using CataRT and the FTM library for Max/MSP we develop a technique for the organization of a navigation space for synthesis based on user-defined spatial zones and informed by the perceptual concept of timbre space. The goal is to place the listener in the midst of a virtual space of sounds organized by their descriptor data, simulating an immersion in timbre space.
  • A Timbre Analysis and Classification Toolkit for Pure Data. ICMC 2010, William Brent. --- Publication of the TimbreID PD externals mentioned under 2009.
  • A Database System for Organizing Musique Concrete. ICMC 2010, Christopher Bailey. --- CBCS in FileMaker on hand-assigned descriptors!
  • Abstraction in a Unit-based Audio Processing System. ICMC 2010, Tom Stoll. --- Requirements for Tom's version of CataRT [6]

SMC 2010

DAFx 2010


  • Corpus-Based Transcription as an Approach to the Compositional Control of Timbre, ICMC 2009, Aaron Einbond, Diemo Schwarz, Jean Bresson [7]. --- Using CataRT and related tools in the FTM and Gabor libraries for Max/MSP we describe a technique for real-time analysis of a live signal to pilot corpus-based synthesis, along with examples of compositional realizations in works for instruments, electronics, and sound installation. To extend this technique to computer-assisted composition for acoustic instruments, we develop tools using the Sound Description Interchange Format (SDIF) to export sonic descriptors to OpenMusic where they may be further manipulated and transcribed into an instrumental score. This presents a flexible technique for the compositional organization of noise-based instrumental sounds.
  • BotTalk by Ben Smith "generates mp3 mashups of a user's top artists. The mashup is created from mp3s from each artist. These are sorted according to pitch and each track has a drum track created by taking the bars from the next track that match the key of the current track. Each track is then mixed with it's beat track, with the beat track looped. The resulting mix tracks are then ordered by loudness and about 30 seconds of each are bolted together to create the final mix-tape." --- corpus-based concatenative mashup synthesis (noticed by HarS)
  • TimbreID is a collection of PD externals that allow descriptor analysis, corpus-based concatenative synthesis and CataRT-style 2D interaction. (found by Bruno Ruviaro)

ICMC 2009

  • Beyond Concatenation: Some Ideas for the Creative Use of Corpus-based Sonic Material, ICMC 2009, Thomas Stoll. [6]

SMC 2009

  • Sound Search by Content-based Navigation in Large Databases, SMC 2009, Diemo Schwarz and Norbert Schnell. --- applies interactive corpus-based browsing based on a 2D projection of the descriptor space (very similar to CataRT) combined with quick selection of classes to searching sounds in large instrument or Fx sound databases.


  • Extending voice-driven synthesis to audio mosaicing, Jordi Janer, Maarten de Boer, Music Technology Group, Universitat Pompeu Fabra, Barcelona, SMC 2009. Abstract: This paper presents a system for controlling audio mosaicing with a voice signal, which can be interpreted as a further step in voice-driven sound synthesis. Compared to voice-driven instrumental synthesis, it increases the variety in the synthesized timbre. Also, it provides a more direct interface for audio mosaicing applications, where the performer voice controls rhythmic, tonal and timbre properties of the output sound. In a first step, voice signal is segmented into syllables, extracting a set of acoustic features for each segment. In the concatenative synthesis process, the voice acoustic features (target) are used to retrieve the most similar segment from the corpus of audio sources. We implemented a system working in pseudo-realtime, which analyzes voice input and sends control messages to the concatenative synthesis module. Additionally, this work raises questions to be further explored about mapping the input voice timbre space onto the audio sources timbre space.
  • Hagevold, Jan Erik. "An Approach to Real-Time Content-Based Audio Synthesis." PhD diss., New York University, 2008.


  • Corpus-Based Concatenative Synthesis: Assembling sounds by content-based selection of units from large sound databases. IEEE Signal Processing Magazine Special Section: Signal Processing for Sound Synthesis, Diemo Schwarz [3]. --- New reference article about CBCS.
  • Benovoy, Mitchel, Andrew Brouse, Thomas Greg Corcoran, Hannah Drayson, Cumhur Erkut, Jean-Julien Filatriau, Christian Frisson et al. "Audiovisual content generation controlled by physiological signals for clinical and artistic applications." (2007).
  • Penrose, Joshua, and Gerald Philips. "Putting the Pieces Together." (2007).

NIME 2007

  • Chroma Palette: Chromatic Maps of Sound As Granular Synthesis Interface (Justin Donaldson, Ian Knopke, Chris Raphael) --- Uses a 2D interface for (usably real-time?) interactive browsing in a sound space based on chroma (pitch classes). The 20 descriptor dimensions (12 pitch classes + 8 octave classes) are reduced to 2 dimensions by a multi-dimensional scaling approach.

ICMC 2007

  • Musical Applications of Real-Time Corpus-Based Concatenative Synthesis (Diemo Schwarz, Ircam-CNRS STMS, France; Sam Britton, freelance, United Kingdom; Roland Cahen, ENSCI, France; Thomas Goepfer, Ircam-Centre Pompidou, France) [4] --- paper about recent pieces and performances using CataRT
  • MUSED: Navigating the Personal Sample Library (Graham Coleman, United States) --- A system resembling CataRT for sample-based music composition. It analyses an input song, segments it into notes, and characterises each segment by pitch, loudness, brilliance, etc. The segments are then laid out in 2D space and can be chosen and played by clicking on them. An innovative reordering feature of the display places the segments that follow the clicked-on segment in a row close to it.
  • Biotools: Introducing a Hardware and Software Toolkit for Fast Implementation of Biosignals for Musical Applications (Miguel Angel Ortiz Pérez, Benjamin Knapp, Sonic Arts Research Centre, United Kingdom) --- Uses CataRT for performances controlled by a muscular-tension biosensor. See this video of a performance.
  • Audiovisual Concatenative Synthesis (Nick Collins, Sussex University, United Kingdom) --- uses 5 audio + 5 video descriptors, audio or video-driven selection of audio+video segment from a database. Matching of a whole sequence of audio/video by « shingling » (cf. Casey).
  • Soundspotter / Remix-TV: Fast Approximate Matching for Audio and Video Performance (Michael Casey, Mick Grierson, Goldsmiths, University of London, United Kingdom) --- Audio-driven audio+video lookup and concatenation using Casey’s SoundSpotter. Matching of a whole sequence of audio/video by « shingling ». References Michel Chion’s work about film-sound. (cf. Collins)
  • Guidage: A Fast Audio Query Guided Assemblage (Arshia Cont, UCSD / Ircam, France; Dubnov Shlomo, University of California in San Diego, United States; Gérard Assayag, Ircam, France) --- Factor-Oracle based implicit-segmentation audio-driven recomposition
  • DataJockey (Alex Norman, Xavier Amatriain): a tool to help a digital DJ select tracks based on meta-data and sound descriptors. --- This is corpus-based synthesis on a larger time-scale, that of a DJ set.
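The "shingling" mentioned in the Collins and Casey entries above, i.e. matching a whole sequence of frames at once rather than frame by frame, can be sketched roughly as below. The sketch assumes that a shingle is simply a run of consecutive descriptor frames concatenated into one long vector, and that matching is an exhaustive nearest-neighbour search; the systems cited use fast approximate matching instead. The data and the names `shingles` and `best_match` are hypothetical.

```python
def shingles(frames, width):
    """Concatenate each run of `width` consecutive frames into one vector."""
    return [
        [value for frame in frames[i:i + width] for value in frame]
        for i in range(len(frames) - width + 1)
    ]

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_match(database_frames, query_frames):
    """Return the start index of the database run closest to the query."""
    width = len(query_frames)
    query = shingles(query_frames, width)[0]
    candidates = shingles(database_frames, width)
    return min(
        range(len(candidates)),
        key=lambda i: squared_distance(candidates[i], query),
    )

# Toy data: each frame holds two invented descriptor values.
database = [(0.0, 0.1), (0.5, 0.5), (0.9, 0.2), (0.4, 0.8), (0.1, 0.1)]
query = [(0.85, 0.25), (0.45, 0.75)]

start = best_match(database, query)
print(start)  # index of the first frame of the best-matching run
```

Matching on shingles preserves the temporal order inside the matched segment, which frame-wise lookup does not.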


  • ScrambledHacks by Sven König --- spectral lookup along with ganged music video concatenation!
  • New spectral explorations by Jean-François Charles --- spectral frame mosaicing in Jitter
  • Concatenative Synthesis Using Score-Aligned Transcriptions. ICMC 2006, Roger Dannenberg's take on chroma (i.e. pitch-class) based lookup of notes or chords from classical music
  • Using Concatenative Synthesis for Expressive Performance in Jazz Saxophone. ICMC 2006, Maestre, E., Hazan, A., Ramirez, R., and Perez, A. [2]
  • Meapsoft --- Java software for descriptor-based concatenative reordering


  • OStitch, Abram Hindle. --- FFT-based frequency-bin-wise mosaicing experimentation, open source OCaml code

Taxonomy up to 2005

This figure offers a comparative survey and taxonomy of the many different approaches until 2005, referenced in [1]. It shows a comparison of musical sound synthesis methods according to selection (y-axis) and analysis (x-axis), use of concatenation quality (bold), and real-time capabilities (italics).


  1. Schwarz, D. (2006). Concatenative Sound Synthesis: The Early Years. Journal of New Music Research, 35(1), 3.
  2. Maestre, E., Hazan, A., Ramirez, R., & Perez, A. (2006). Using Concatenative Synthesis for Expressive Performance in Jazz Saxophone. Paper presented at International Computer Music Conference (ICMC).
  3. Schwarz, D. (2007). Corpus-Based Concatenative Synthesis. IEEE Signal Processing Magazine, 24(2), 92.
  4. Schwarz, D., Britton, S., Cahen, R., & Goepfer, T. (2007). Musical Applications of Real-Time Corpus-Based Concatenative Synthesis. Paper presented at International Computer Music Conference (ICMC), Copenhagen, Denmark.
  5. Schwarz, D., Cahen, R., & Britton, S. (2008). Principles and Applications of Interactive Corpus-Based Concatenative Synthesis. Paper presented at Journées d'Informatique Musicale (JIM), GMEA, Albi, France.
  6. Stoll, T. (2009). Beyond Concatenation: Some Ideas for the Creative Use of Corpus-based Sonic Material. Paper presented at International Computer Music Conference (ICMC), Montreal, QC.
  7. Einbond, A., Schwarz, D., & Bresson, J. (2009). Corpus-Based Transcription as an Approach to the Compositional Control of Timbre. Paper presented at International Computer Music Conference (ICMC), Montreal, QC, Canada.
  8. Maestre, E., Ramírez, R., Kersten, S., & Serra, X. (2009). Expressive Concatenative Synthesis by Reusing Samples from Real Performance Recordings. Computer Music Journal, 33(4), 23.
  9. Tremblay, P. A., & Schwarz, D. (2010). Surfing the Waves: Live Audio Mosaicing of an Electric Bass Performance as a Corpus Browsing Interface. Paper presented at Conference for New Interfaces for Musical Expression, Sydney, Australia.
  10. Martin, A., Ferguson, S., & Beilharz, K. (2010). Mechanisms for Controlling Complex Sound Sources: Applications to Guitar Feedback Control. Paper presented at Conference for New Interfaces for Musical Expression, Sydney, Australia.
  11. Leslie, G., Schwarz, D., Warusfel, O., Bevilacqua, F., Zamborlin, B., Jodlowski, P., & Schnell, N. (2010). Grainstick: A Collaborative, Interactive Sound Installation. Paper presented at International Computer Music Conference (ICMC), New York, NY.
  12. Einbond, A., & Schwarz, D. (2010). Spatializing Timbre with Corpus-Based Concatenative Synthesis. Paper presented at International Computer Music Conference (ICMC), New York, NY.
  13. Schwarz, D., & Schnell, N. (2010). A Modular Sound Descriptor Analysis Framework for Relaxed-real-time Applications. Paper presented at International Computer Music Conference (ICMC), New York, NY.
  14. Schwarz, D., & Schnell, N. (2010). Descriptor-based Sound Texture Sampling. Paper presented at International Conference on Sound and Music Computing (SMC), Barcelona, Spain.
  15. Stowell, D., & Plumbley, M. (2010). Timbre Remapping through a Regression-Tree Technique.
