
Meta goes big again! New VR models land a CVPR oral: understanding sound the way a human does

2022-07-01 13:04:00 Zhiyuan community

Meta's research covers three models: a visual-acoustic matching model (Visual Acoustic Matching), a visually informed dereverberation model (Visually-Informed Dereverberation), and an audio-visual separation model (VisualVoice).

First, the visual acoustic matching model transforms the audio in a clip so that it matches the acoustics of a target environment: given an image of the target space and the waveform of the source audio, the model re-synthesizes the audio so that it sounds as if it had been recorded in that room.
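To make that image-plus-waveform-in, waveform-out interface concrete, here is a minimal, hypothetical PyTorch sketch. It is not Meta's released model: the module layout, layer shapes, and the simple "broadcast the image embedding over audio frames" fusion are all illustrative assumptions.

```python
# Minimal sketch (assumptions, not Meta's implementation) of the visual
# acoustic matching interface: target-room image + source waveform -> audio
# re-synthesized to match the room's acoustics.
import torch
import torch.nn as nn

class VisualAcousticMatcher(nn.Module):
    def __init__(self, embed_dim: int = 256):
        super().__init__()
        # Visual branch: encode the target-room photo into an acoustic-context embedding.
        self.visual_encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=7, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim),
        )
        # Audio branch: 1-D conv encoder/decoder over the raw waveform.
        self.audio_encoder = nn.Conv1d(1, embed_dim, kernel_size=400, stride=160, padding=200)
        self.audio_decoder = nn.ConvTranspose1d(embed_dim, 1, kernel_size=400, stride=160, padding=200)
        # Fusion: condition every audio frame on the visual embedding.
        self.fuse = nn.Linear(2 * embed_dim, embed_dim)

    def forward(self, image: torch.Tensor, waveform: torch.Tensor) -> torch.Tensor:
        # image: (B, 3, H, W) target-room photo; waveform: (B, 1, T) source audio.
        v = self.visual_encoder(image)                   # (B, D)
        a = self.audio_encoder(waveform)                 # (B, D, T')
        v = v.unsqueeze(-1).expand(-1, -1, a.shape[-1])  # broadcast over time frames
        fused = self.fuse(torch.cat([a, v], dim=1).transpose(1, 2)).transpose(1, 2)
        return self.audio_decoder(fused)                 # (B, 1, ~T) acoustically matched audio

# Usage: re-synthesize 1 s of 16 kHz audio for a 224x224 photo of the target room.
model = VisualAcousticMatcher()
matched = model(torch.randn(1, 3, 224, 224), torch.randn(1, 1, 16000))
print(matched.shape)  # torch.Size([1, 1, 16000])
```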
 
Next is the visually-informed dereverberation model (VIDA), which learns to remove reverberation from the observed sound by also taking the visual scene into account.
 
Finally, the VisualVoice model performs cross-modal audio-visual separation, using visual cues in a video to pull apart the voices it contains.
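As a rough illustration of what "using visual cues to separate voices" can mean in practice, the sketch below predicts a soft spectrogram mask for a target speaker from the mixture spectrogram plus an embedding of that speaker's face track. The architecture, dimensions, and names are illustrative assumptions, not the VisualVoice implementation.

```python
# Minimal sketch (assumptions, not the VisualVoice code) of audio-visual
# speech separation: mixture spectrogram + target-speaker face embedding ->
# estimated spectrogram of that speaker.
import torch
import torch.nn as nn

class AVSeparator(nn.Module):
    def __init__(self, n_freq: int = 257, face_dim: int = 128, hidden: int = 256):
        super().__init__()
        # Recurrent network over time, conditioned on the face embedding at every frame.
        self.rnn = nn.GRU(n_freq + face_dim, hidden, batch_first=True, bidirectional=True)
        self.mask_head = nn.Sequential(nn.Linear(2 * hidden, n_freq), nn.Sigmoid())

    def forward(self, mix_mag: torch.Tensor, face_emb: torch.Tensor) -> torch.Tensor:
        # mix_mag: (B, T, F) magnitude spectrogram of the mixed audio
        # face_emb: (B, face_dim) embedding of the target speaker's face track
        face = face_emb.unsqueeze(1).expand(-1, mix_mag.shape[1], -1)
        h, _ = self.rnn(torch.cat([mix_mag, face], dim=-1))
        mask = self.mask_head(h)          # (B, T, F) soft mask in [0, 1]
        return mask * mix_mag             # estimated target-speaker spectrogram

sep = AVSeparator()
est = sep(torch.rand(1, 100, 257), torch.randn(1, 128))
print(est.shape)  # torch.Size([1, 100, 257])
```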
Paper: https://arxiv.org/pdf/2202.06875.pdf
 