Sound Texture Synthesis
Documentation and examples of environmental sound texture synthesis by Corpus-Based Concatenative Synthesis (CBCS)
The synthesis of credible environmental sound textures such as wind, rain, fire, crowds, or traffic noise is a crucial component of many applications in computer games, installations, audio-visual production, and cinema. Sound textures are often part of the soundscape of a long scene, and in interactive applications such as games and installations the length of the scene is not determined in advance. It is therefore advantageous to be able to play a given texture for an arbitrary amount of time, but simple looping introduces repetition that is easy to pick out. Using very long loops, or layering several loops, can mitigate this problem (and is how sound designers currently handle it), but it requires that a long enough recording of a stable environmental texture be available, and it uses up a lot of media and memory space.
Overview of Sound Texture Synthesis
A state-of-the-art overview of sound texture synthesis in general can be found in this article:
Recent research was carried out in the PHYSIS project, in which the partners tried different methods to extend a given sound texture recording for an indefinite amount of time, avoiding the need to resort to looping.
Examples from four new algorithms developed within the project can be heard in a listening test evaluating several texture synthesis algorithms.
Smooth Granular Sound Texture Expansion by Control of Timbral Similarity
This algorithm extends a given environmental texture recording for an indefinite amount of time by playing grains while avoiding repetition and artefacts such as timbral discontinuities, with the help of audio descriptors.
Diemo Schwarz, Sean O'Leary. Smooth Granular Sound Texture Synthesis by Control of Timbral Similarity. Sound and Music Computing (SMC), Jul 2015, Maynooth, Ireland. 12th Sound and Music Computing Conference, pp.6, 2015.
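The core of such a grain-selection scheme can be sketched as follows. This is a minimal illustration, not the published algorithm: the descriptor set, the neighbourhood size `k`, and the recency exclusion list are all assumptions made for the sketch; the paper above defines the actual similarity control.

```python
import numpy as np

def select_next_grain(descriptors, current, recent, k=5, rng=None):
    """Pick the next grain to play: among the k grains most similar in
    descriptor space to the current one, choose one at random, excluding
    recently played grains so that audible repetition is avoided.

    descriptors -- (n_grains, n_features) array of audio descriptors
                   per grain (e.g. loudness, spectral centroid)
    current     -- index of the grain currently playing
    recent      -- indices of recently played grains to exclude
    """
    rng = rng or np.random.default_rng()
    # Euclidean distance in descriptor space = timbral dissimilarity
    dist = np.linalg.norm(descriptors - descriptors[current], axis=1)
    dist[current] = np.inf            # never replay the same grain
    for idx in recent:                # exclude recently played grains
        dist[idx] = np.inf
    candidates = np.argsort(dist)[:k] # k timbrally closest grains
    return int(rng.choice(candidates))
```

Calling this in a loop, appending each result to the recency list, yields an endless grain sequence that stays timbrally coherent without ever looping verbatim.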
Interactive High-Level Control of Sound Textures
These articles present a way to make environmental recordings controllable again through continuous annotations of the high-level semantic parameter one wishes to control, e.g. wind strength or crowd excitation level. A partial annotation can be propagated to cover the entire recording via cross-modal analysis between gesture and sound by canonical time warping (CTW).
Diemo Schwarz, Baptiste Caramiaux. Interactive Sound Texture Synthesis Through Semi-Automatic User Annotations. Springer International Publishing; Aramaki, M., Derrien, O., Kronland-Martinet, R., Ystad, S. Sound, Music, and Motion, Lecture Notes in Computer Science, Vol. 8905, pp.372-392, 2014.
- Original wind sound (recording by Roland Cahen)
- Resynthesis via annotation going from min to max and back to min
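The annotation-propagation idea can be illustrated with a simplified sketch. The paper uses canonical time warping; the version below substitutes plain dynamic time warping between the annotation curve and a single audio feature curve, which is an assumption made purely for illustration:

```python
import numpy as np

def dtw_path(a, b):
    """Dynamic time warping path between two 1-D sequences
    (a simplified stand-in for the canonical time warping
    used in the published method)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j - 1],
                                 cost[i - 1, j], cost[i, j - 1])
    # backtrack from the end to recover the alignment path
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1],
                              cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

def propagate_annotation(annotation, audio_feature):
    """Transfer a partial annotation curve onto the full recording by
    aligning it with an audio feature curve (e.g. loudness)."""
    out = np.zeros(len(audio_feature))
    for i, j in dtw_path(annotation, audio_feature):
        out[j] = annotation[i]
    return out
```

After propagation, every frame of the recording carries a value of the semantic parameter, so the texture can be resynthesised by requesting any trajectory of that parameter.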
Interactive Descriptor-Based Synthesis of Environmental Sound Textures
These examples show control of the sound character of the textures by navigation through a descriptor space. Descriptors are automatically extracted audio features such as brilliance or periodicity; synthesis is granular, with long grains and large overlap.
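The navigation principle can be sketched in two steps: find the grains closest to the target point in descriptor space, then mix them with large overlap. The descriptor names, neighbourhood size, and windowing are assumptions of this sketch, not details of the actual system:

```python
import numpy as np

def grains_near_target(descriptors, target, n=4):
    """Indices of the n grains whose descriptor vectors (e.g.
    brilliance, periodicity) lie closest to the target point the
    user is navigating to in descriptor space."""
    dist = np.linalg.norm(descriptors - np.asarray(target), axis=1)
    return np.argsort(dist)[:n]

def overlap_add(grains, hop):
    """Mix equal-length grains with large overlap: each grain starts
    `hop` samples after the previous one and is cross-faded by a
    Hann window to avoid clicks at grain boundaries."""
    length = hop * (len(grains) - 1) + len(grains[0])
    out = np.zeros(length)
    win = np.hanning(len(grains[0]))
    for k, g in enumerate(grains):
        out[k * hop : k * hop + len(g)] += win * g
    return out
```

Moving the target point in descriptor space continuously changes which grains are selected, and hence the perceived character of the texture, while the long overlapping grains keep the output smooth.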