Deep learning turns mono recordings into immersive sound

Having learned this, the system can watch a video and then transform a monaural recording in a way that simulates where the sound ought to be coming from. "We call the resulting output 2.5D visual sound—the visual stream helps 'lift' the flat single-channel audio into spatialized sound," say Grauman and Gao.

The results are impressive. You can watch a video of their work here; be sure to wear headphones while you do. The video compares 2.5D recordings with the original monaural recordings and shows how convincing the effect can be. "The predicted 2.5D visual sound offers a more immersive audio experience," say Grauman and Gao.

However, the system doesn't produce full 3D sound, for the reasons mentioned above: the researchers don't create a personalized head-related transfer function. And there are some situations the algorithm finds difficult to handle. Obviously, it cannot deal with any sound source that is not visible in the video, nor with sound sources it has not been trained to recognize. The system is focused mainly on music videos. Nevertheless, Grauman and Gao have a clever idea that works well for many of them.
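One way to frame the "lifting" step is as predicting the difference between the two ear channels: if the mono track is taken to be the sum of the left and right channels, then recovering a predicted difference signal is enough to reconstruct both channels. The sketch below illustrates only that arithmetic identity, not the authors' actual model; the ground-truth difference stands in for what a trained network would predict, and all signal names are illustrative.

```python
import numpy as np

# Toy left/right signals standing in for a real binaural recording.
rng = np.random.default_rng(0)
left = rng.standard_normal(8)
right = rng.standard_normal(8)

# Assumption: the mono track is the sum of the two ear channels.
mono = left + right

# A network would predict the difference signal from video + mono;
# here we substitute the ground truth to show the reconstruction.
predicted_diff = left - right

# Recover both channels from the mono track and the predicted difference.
recovered_left = (mono + predicted_diff) / 2
recovered_right = (mono - predicted_diff) / 2

assert np.allclose(recovered_left, left)
assert np.allclose(recovered_right, right)
```

The appeal of this formulation is that the network never has to regenerate audio content from scratch; it only has to learn the (visually grounded) spatial offset between the channels.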