
FoleySonic is a tool for gesture-controlled sound design, using off-the-shelf motion controllers such as the Nintendo Wii Remote or the PlayStation Move controller.

Sound designers and Foley artists have long struggled to create expressive soundscapes with standard editing software, devoting much time to calibrating multiple sound samples and adjusting parameters.

Our approach allows the user to control digital sound generators with a set of motion gestures that resemble interaction with physical sounding objects. The user is provided with a gesture-controlled sound toolbox whose behavior closely mirrors interaction with familiar objects from the classic Foley toolbox, while remaining flexible enough to create a wide spectrum of sounds. Familiar mental models allow for intuitive interaction with the sound.

Rather than requiring deep technical knowledge of sound design, the system leverages the user’s motor memory and motion skills to mimic generic, familiar interactions with everyday sounding objects. This lets the user focus fully on the expressive act of sound creation while enjoying a fluid workflow and a satisfying user experience.

The system has been presented at the 128th AES Convention and the 2011 ACM TEI conference.

FoleySonic also won first prize in the new-media category of the 2011 “Campusideen” business plan competition, an annual business award presented by a municipal business development bank.

FoleySonic: Motion data is captured from the controller and used to interact with a mental model resembling a familiar physical sounding object. Custom sound generators or standard VST instruments can be controlled intuitively through simple motion gestures.
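The mapping from raw motion data to a sound-generator parameter can be illustrated with a minimal sketch. The function names (`motion_intensity`, `map_to_param`), the rest-gravity baseline, and the smoothing constants below are illustrative assumptions, not part of the actual FoleySonic implementation: accelerometer samples are reduced to a deviation from the 1 g resting magnitude, then smoothed into a 0–1 control value that could drive, say, the gain or excitation of a sound generator.

```python
import math

def motion_intensity(sample, rest_g=1.0):
    """Deviation of the accelerometer magnitude from the resting 1 g baseline.

    `sample` is an (x, y, z) tuple in units of g. At rest the magnitude is
    ~1 g (gravity only), so the deviation is ~0; shaking raises it.
    """
    x, y, z = sample
    magnitude = math.sqrt(x * x + y * y + z * z)
    return abs(magnitude - rest_g)

def map_to_param(samples, sensitivity=2.0, alpha=0.3):
    """Map a stream of accelerometer samples to a smoothed 0..1 parameter.

    Each raw intensity is scaled by `sensitivity`, clamped to 1.0, and fed
    through exponential smoothing (factor `alpha`) so the control value
    rises and decays fluidly rather than jittering sample-to-sample.
    """
    level = 0.0
    out = []
    for s in samples:
        target = min(1.0, sensitivity * motion_intensity(s))
        level = alpha * target + (1.0 - alpha) * level
        out.append(level)
    return out
```

A controller at rest yields a parameter near zero, while a shake drives it toward one; the smoothed value could be patched to any continuous control of a custom generator or VST instrument.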


The task of sound placement on video timelines is usually a time-consuming process that requires the sound designer or Foley artist to carefully calibrate the position and length of each sound sample. Novice and home video producers need friendlier, more entertaining input methods. We demonstrate a novel approach that harnesses the motion-sensing capabilities of readily available input devices, such as the Nintendo Wii Remote or modern smartphones, to provide intuitive and fluid arrangement of samples on a timeline. Users can watch a video while simultaneously adding sound effects, providing a near real-time workflow. The system leverages the user’s motor skills for enhanced expressiveness and provides a satisfying experience while accelerating the process.
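The near real-time placement workflow can be sketched as simple event detection over the motion stream: whenever the motion intensity rises above a trigger threshold while the video plays, a sample is dropped onto the timeline at the current playback position. The function name, threshold, and frame-rate handling below are illustrative assumptions rather than the system's actual implementation.

```python
def place_samples(intensities, frame_rate=30.0, threshold=0.5):
    """Turn a per-frame motion-intensity stream into timeline positions.

    `intensities` is one 0..1 motion value per video frame. A sample is
    placed at each rising edge, i.e. the frame where the intensity first
    crosses `threshold`, converted to seconds via `frame_rate`. Frames
    that stay above the threshold do not retrigger, so one gesture
    places exactly one sample.
    """
    positions = []
    above = False
    for frame, value in enumerate(intensities):
        crossing = value >= threshold
        if crossing and not above:
            positions.append(frame / frame_rate)
        above = crossing
    return positions
```

Because placement happens while the video plays, the user hears each effect in context immediately and can re-perform a gesture instead of nudging clip boundaries by hand.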