Research on a MATLAB-based processing scheme for natural-transition dubbing
2022-06-26 16:28:00 【Zhuoqing】

Abstract: Starting from the need to modify video dubbing after shooting, this article sets the goal of blending re-recorded audio naturally into the original material. By analyzing the factors that make up environmental sound, it settles on a processing scheme: record the impulse response of the environment and convolve it with the audio. The feasibility of the convolution scheme is then explored through practical tests in MATLAB, and after improvement the result is acceptable. Finally, a feasible workflow is proposed and the problems encountered in the experiment are discussed.
Keywords: convolution, audio processing, impulse response
I. Background and objective
1. Background
Alongside my studies, I run a personal account on Bilibili, where I publish unboxing reviews of digital products. During video production, the audio of the original footage occasionally has to be modified, for example to fix a slip of the tongue, correct a mistake, or add some content. For footage shot outside the dormitory, dubbing it directly in the dormitory produces a strong mismatch between picture and sound because the recording environments differ, and the transition does not sound natural.
In December 2021, the author took part in the post-production editing of a class play. In one scene, the female lead's voice was noticeably quieter than the male lead's, so post-dubbing was required. Because outdoor shooting was not possible at the time, the dubbing was done in the dormitory. The recording itself was clean, but it lacked the outdoor background sound, so it blended poorly into the video.

▲ Figure 1.1 Class play clip that needs post-processing
2. Objective
Use MATLAB-based audio processing so that audio recorded in a quiet, echo-free dormitory can blend naturally into other environments. This reduces the workload of later modifications and also improves the overall look and feel of the final video.
II. Theoretical analysis of the scheme
1. Sources of environmental sound
When shooting on location, the environment has a large effect on the recorded audio. Besides the direct sound from the people in the video, the microphone also picks up sound reflected from walls and the ground, noise from passing vehicles, and background noise such as air flow and the microphone's own noise.

▲ Figure 2.1 Schematic of the sound received by the microphone during location shooting
These sounds have a considerable effect on the final recording and cannot be ignored. The sound received by the microphone can therefore be treated as the target sound after it has passed through a specific environmental system. To simulate a location recording from audio recorded in a quiet environment, a way to describe that system is needed.
2. Impulse response
The impulse response is defined as the time-domain response of a system under test when a pulse excitation signal is applied to its input. In acoustic analysis, the impulse response is regarded as the acoustic signature of a system. It carries a great deal of information about the system, including arrival times, frequency content, reverberation decay characteristics, and the overall frequency response. Measuring the impulse response therefore yields a description of the system.

▲ Figure 2.2 Composition of an impulse response (source: the Internet)
3. Using convolution to process audio
As mentioned above, the sound received by the microphone is equivalent to the original sound after it has been processed by a specific environmental system. Let Y be the sound received by the microphone, X the original sound, and H the transfer function of the system. In the s-domain:
$$Y(s) = H(s) \cdot X(s)$$
Since the transfer function H(s) is the Laplace transform of the system's impulse response h(t), H(s) can be obtained by actually measuring the impulse response. Because multiplication in the s-domain corresponds to convolution in the time domain:
$$y(t) = h(t) * x(t)$$
Therefore, convolving audio recorded in a quiet environment with the impulse response recorded in a specific environment should, in theory, give audio that sounds as if it had been recorded in that environment, which can then be blended smoothly into the material that needs to be changed.
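The relationship between the two equations above can be checked numerically. The following is a minimal sketch, not part of the original experiment: the signals x and h are synthetic stand-ins chosen for illustration, and the FFT is used as the discrete counterpart of the s-domain product.

```matlab
% Synthetic stand-ins (assumptions, not the recordings from the article)
fs = 48000;                         % sampling rate used in the article
t  = (0:fs/10-1)'/fs;               % 0.1 s time axis
x  = sin(2*pi*440*t);               % "dry" signal recorded in a quiet room
h  = [1; zeros(479,1); 0.5; zeros(479,1); 0.25];  % direct sound plus two echoes

% Time domain: y(t) = h(t) * x(t)
y_time = conv(x, h);

% Frequency domain: Y = H .* X, zero-padded to the full convolution length
N      = numel(x) + numel(h) - 1;
y_freq = real(ifft(fft(x, N) .* fft(h, N)));

% Both routes should agree up to floating-point rounding
max_err = max(abs(y_time - y_freq));
```

The output is longer than the input by numel(h) - 1 samples, which is the reverberation tail added by the environment.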
III. Experimental test
1. Test plan
(1) Selection of the test scene
Balancing practical conditions against ease of operation, the dormitory room is chosen as the quiet recording environment and the bathroom as the specific environment to be simulated. Because the bathroom is small and well sealed, recordings made there have strong reverberation, which makes the effect easy to observe.

▲ Figure 3.1 The bathroom chosen as the specific recording environment
(2) Selection of the recording device
The recording device is a USB condenser microphone. Compared with a mobile phone, the microphone provides some noise reduction, and it records in mono, which simplifies the later processing.

▲ Figure 3.2 The USB microphone used in the experiment
(3) Selection of the test audio
The text recorded in the bathroom is "Tsinghua University, Class Zi 02", and the text recorded in the dormitory is "Department of Automation". The aim is to merge them into "Tsinghua University, Department of Automation, Class Zi 02". For the impulse signal, several ways of producing it were tried, and the one that sounded best was used to record the impulse response of the bathroom.
2. Test procedure
(1) Record audio in the bathroom
Record "Tsinghua University, Class Zi 02" in the bathroom and name it "bathvoice".
(2) Measure the impulse response of the bathroom
Snap fingers in the bathroom to excite the room, record the response as the impulse response of the bathroom environment, and name it "pulse".
(3) Record audio in the dormitory
Record "Department of Automation" in the dormitory and name it "roomvoice".
(4) Check the audio waveforms
Use the GoldWave audio editor to check that each file is mono. All three are, as shown below:

▲ Figure 3.3 Waveforms of the audio files; all are mono
The narrow strip below each waveform is the overall progress bar, not a second channel. All three audio files are mono.
(5) Convolution in MATLAB
Import each audio file into MATLAB, as shown below:

▲ Figure 3.4 Audio files after import into MATLAB; the sampling rate fs is 48000 Hz
Convolve roomvoice with pulse and name the result ans1:
>> ans1 = conv(roomvoice,pulse)
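The import and convolution steps above can be sketched as a script. This is only an illustration under stated assumptions: the article imports the files interactively and names only pulse.m4a, so the sketch first writes two synthetic stand-in WAV files (hypothetical names roomvoice_demo.wav and pulse_demo.wav) and then reads them back with audioread.

```matlab
% Assumption: synthetic stand-in files so the sketch runs on its own
fs = 48000;
n  = (0:fs-1)';                                   % one second of sample indices
dryFile = fullfile(tempdir, 'roomvoice_demo.wav');   % hypothetical file name
irFile  = fullfile(tempdir, 'pulse_demo.wav');       % hypothetical file name
audiowrite(dryFile, 0.5*sin(2*pi*220*n/fs), fs);
audiowrite(irFile,  0.9*exp(-n(1:4800)/800) .* cos(2*pi*1000*n(1:4800)/fs), fs);

% Import the audio; audioread returns the samples and the sampling rate
[roomvoice, fsVoice] = audioread(dryFile);
[pulse,     fsPulse] = audioread(irFile);

% conv works on vectors, so both signals must be mono and share one rate
assert(size(roomvoice, 2) == 1 && size(pulse, 2) == 1, 'expected mono audio');
assert(fsVoice == fsPulse, 'sampling rates differ; resample one signal first');

% Same operation as in the article
ans1 = conv(roomvoice, pulse);
```

Checking the channel count in code plays the same role as the GoldWave check in step (4).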
IV. Analysis of test results
1. Analysis of ans1
Listening to ans1 shows that the convolution result is noisy and the response lasts very long, so the effect is unsatisfactory.
>> sound(ans1, fs)
>> audiowrite('ans1.wav', ans1, 48000)
To find the cause, roomvoice, pulse, and ans1 are drawn with the plot command:



▲ Figure 4.1 Waveforms of roomvoice, pulse, and ans1
The plots show that roomvoice decays to nearly 0 at about 3.8 seconds, whereas ans1 only begins to decay at close to 4 seconds. In addition, the first 4 seconds of ans1 are much denser than roomvoice and do not drop close to 0 in the quiet passages, which is heard as heavy noise and long reverberation. Looking at the impulse response pulse explains this: the impulse does not start at time zero, which introduces a transmission delay, and its reverberation decays slowly while the background noise is high. Together these lead to the unsatisfactory convolution result. The next step is therefore to improve the impulse response pulse.
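The two problems identified above, the start-up delay and the background noise, can also be estimated numerically rather than read off the plot. The following is a minimal sketch on a synthetic impulse response; the variable ir, the 50 ms delay, the noise level, and the 50 % onset threshold are all assumptions made for illustration.

```matlab
% Synthetic impulse response with a lead-in delay and background noise
fs = 48000;
rng(0);                                    % reproducible noise
delaySamples = 2400;                       % 50 ms of lead-in (assumed)
ir    = 0.002*randn(fs, 1);                % background noise floor
burst = exp(-(0:9599)'/1500);              % decaying response to the snap
idx   = delaySamples+1 : delaySamples+numel(burst);
ir(idx) = ir(idx) + burst;

% Onset: first sample exceeding half of the peak amplitude (assumed threshold)
onsetIdx     = find(abs(ir) > 0.5*max(abs(ir)), 1, 'first');
onsetDelayMs = 1000*(onsetIdx - 1)/fs;

% Noise floor: RMS of the samples before the onset
noiseRms      = sqrt(mean(ir(1:max(onsetIdx-1, 1)).^2));
peakToNoiseDb = 20*log10(max(abs(ir))/noiseRms);
```

A large onsetDelayMs shifts the whole convolved signal later, and a low peakToNoiseDb means the noise is convolved into the result along with the room response.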
2. Trimming the impulse response
Zooming in on the waveform of the impulse response pulse gives the figure below:

▲ Figure 4.2 Impulse response after zooming in
Comparing this with the schematic of impulse response composition (Figure 2.2) shows that the period from the direct sound to the end of the early decay takes up only a small part of the recorded impulse response. The rest is reverberation build-up and reverberation decay. The bathroom mainly affects the sound through first reflections, wall absorption, and reverberation, and the recording equipment has its own limitations. For these reasons, only the part of the impulse response from the arrival of the direct sound through the reverberation build-up is kept, and the rest is discarded. This ensures that the bathroom itself accounts for most of the processing applied to the sound and keeps the influence of other factors as small as possible.
The procedure is as follows. First re-import pulse.m4a and name it pulse2. Double-click pulse2 to open it and delete all the 0 values at the start, as shown below:

▲ Figure 4.3 Deleting all leading 0 values of pulse2
Then find the region where the impulse response starts to show small values over a wide range and delete everything after it, for example samples 22290 to 74304. Several adjustments may be needed to get the best convolution result:

▲ Figure 4.4 Deleting the small values in the second half of pulse2
Plotting the processed impulse response with the plot command gives Figure 4.5.
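The manual trimming described above could also be done programmatically. The sketch below is an alternative to the hand editing used in the article, not the author's method: it works on a synthetic pulse, and the thresholds startThresh and tailThresh and the 10 ms window are hypothetical values that would need tuning by ear, just like the manual cut points.

```matlab
% Synthetic impulse response: silence, decaying burst, then low-level noise
fs = 48000;
rng(1);
pulse = [zeros(1000,1); ...
         exp(-(0:19999)'/3000) .* randn(20000,1); ...
         0.001*randn(30000,1)];
pulse = pulse / max(abs(pulse));            % normalize peak to 1

% 1) Drop everything before the first clearly non-zero sample
startThresh = 0.05;                         % hypothetical threshold
firstIdx = find(abs(pulse) > startThresh, 1, 'first');

% 2) Cut the tail after the short-time RMS envelope last exceeds a threshold
win        = round(0.01*fs);                % 10 ms window (assumed)
env        = sqrt(movmean(pulse.^2, win));  % movmean requires R2016a or later
tailThresh = 0.02;                          % hypothetical threshold
lastIdx    = find(env > tailThresh, 1, 'last');

% Trimmed impulse response, corresponding to pulse2 in the article
pulse2 = pulse(firstIdx:lastIdx);
```

Raising tailThresh shortens the retained reverberation, while lowering it keeps more of the decay and more of the noise.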

▲ Figure 4.5 Zoomed-in waveform of the first half of the processed pulse2
3. Analysis of ans2
Convolve roomvoice with pulse2, name the result ans2, export it to a wav file, and listen to it.
>> ans2 = conv(roomvoice,pulse2)
>> audiowrite('ans2.wav', ans2, 48000)
>> sound(ans2,fs)
The convolution now works well and comes close to bathvoice, the audio recorded directly in the bathroom. The waveform drawn with the plot command is shown below. The reverberation length is greatly improved, and the noise problem is also reduced to some extent:

▲ Figure 4.6 Waveform of ans2
Next, bathvoice and ans2 are spliced together in Adobe Premiere Pro to simulate modifying and replacing audio in a real project, and the result is exported as final.mp3. As a comparison, bathvoice and the unprocessed roomvoice are spliced in the same way and exported as normal.mp3. The files final.mp3 and normal.mp3, together with the Adobe Premiere Pro project file (audio editing.prproj), are included in the package accompanying this paper.

▲ Figure 4.7 Simulating splicing and replacement as in practical use
4. Comparison of processed and unprocessed audio
Listening to final.mp3 and normal.mp3, it is still easy to hear that the processed audio has been spliced. Compared with the unprocessed audio, however, the processed segment carries reverberation that simulates the bathroom environment and gives the impression that the added part was also recorded in the bathroom. The unprocessed audio has no such effect.
The comparison shows that the MATLAB-based audio processing in this paper is effective: it reduces the sense of incongruity in post-dubbing and achieves a more natural transition.
V. Practical application and supplementary notes
1. A feasible workflow
When shooting video on location, record one or more impulse responses in each scene. If the dubbing has to be modified later, re-record the correct audio, convolve it with the processed impulse responses of the corresponding scene, and choose the result that sounds closest to the original audio for splicing. This gives a natural transition.
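The workflow above can be sketched as a small batch script that produces one candidate per impulse response for auditioning. This is an illustration only: dry and the two entries of irs are synthetic stand-ins, the candidate file names are hypothetical, and the final choice is still made by ear as described in the text.

```matlab
% Stand-ins (assumptions): a re-recorded line and two trimmed impulse responses
fs  = 48000;
t   = (0:fs/2-1)'/fs;
dry = 0.5*sin(2*pi*300*t);                         % re-recorded audio
irs = { [1; zeros(239,1); 0.4], ...                % scene IR, take 1
        exp(-(0:4799)'/1200) };                    % scene IR, take 2

candidates = cell(size(irs));
for k = 1:numel(irs)
    wet = conv(dry, irs{k});                       % apply the scene's response
    % Peak-normalize so the exported file does not clip
    candidates{k} = 0.9 * wet / max(abs(wet));
    % Uncomment to export each candidate for listening (hypothetical names):
    % audiowrite(sprintf('candidate_%d.wav', k), candidates{k}, fs);
end
```

Normalizing each candidate to the same peak makes the comparison less dependent on the level at which each impulse response happened to be recorded.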
2. Supplementary notes on problems in the experiment
The parts of this work that lacked the necessary conditions, and the points that need attention, are summarized below:
(1) Use high-quality recording equipment where possible
The key step in this paper is the convolution of signals. If the impulse response is not captured correctly, or the post-dubbed audio is too noisy, the accuracy of the convolved audio may suffer badly. This experiment used a condenser microphone costing about 100 yuan and recorded in the bathroom and dormitory with noise kept to a minimum (doors and windows closed). Some noise floor remained, but the result was still satisfactory. A better microphone and a professional studio should give a better result.
(2) Pay attention to the microphone distance when dubbing
In the author's everyday experience of dubbing videos, the distance between the mouth and the microphone strongly affects the recording. When dubbing, try to keep the same mouth-to-microphone distance as in the original recording, and if conditions allow, keep the angle and relative position consistent as well. This gives the most faithful result. In this experiment, roomvoice was recorded with the microphone too close because of the dormitory conditions, so a noticeable sense of incongruity remains after processing.
(3) Read a complete sentence to reproduce the intonation
Convolution with an impulse response can process the audio, but it cannot change the intonation. When dubbing, avoid reading isolated words. Read the whole sentence instead, keeping the tone and intonation consistent with the original material, so that the result blends in better. In this experiment roomvoice contains only the four characters of "Department of Automation", whose intonation differs greatly from the original sentence "Tsinghua University, Class Zi 02". This makes it hard to blend in and should be avoided in practice.
(4) Adjust loudness and other parameters with other software
Convolution changes the loudness of the audio, usually increasing it slightly. When splicing, MATLAB can be used to adjust the amplitude of the convolved audio so that it matches the original as closely as possible. Audio and video software such as Adobe Premiere Pro can also be used to adjust the loudness. In final.mp3 from this experiment, the gains of the three audio segments are +3.8 dB, -3.3 dB, and +5.9 dB.
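The amplitude adjustment in MATLAB mentioned above can be sketched as RMS matching. This is one simple way to do it, under stated assumptions: reference and processed are synthetic stand-ins for the original recording and the convolved audio, and RMS is used as a rough proxy for perceived loudness.

```matlab
% Stand-ins (assumptions): original recording and louder convolved audio
fs        = 48000;
t         = (0:fs-1)'/fs;
reference = 0.30*sin(2*pi*200*t);          % original audio to match
processed = 0.55*sin(2*pi*200*t + 1);      % convolved audio, louder

% Gain that brings the processed audio to the reference RMS level
rmsOf  = @(x) sqrt(mean(x.^2));
gain   = rmsOf(reference) / rmsOf(processed);
gainDb = 20*log10(gain);                   % same gain expressed in dB
matched = gain * processed;

% Guard against clipping before export with audiowrite
peak = max(abs(matched));
if peak > 1
    matched = matched / peak;
end
```

The value of gainDb is comparable to the per-segment gains applied in Adobe Premiere Pro, although an editor's loudness controls may weight frequencies differently.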

▲ Figure 5.1 Adjusting loudness for better integration
References:
[1] Signals and Systems, spring semester 2022, fourth assignment, https://zhuoqing.blog.csdn.net/article/details/123550045.
[2] 26 Class play -Video-Export, https://www.bilibili.com/video/BV1da411r7uM.
[3] Explanation of acoustic concepts: figuring out what an impulse response is, https://blog.csdn.net/qq_28350219/article/details/114096751.
[4] Sound changing principle: convolution and transfer function, https://www.csdn.net/tags/MtTaAgzsNTgzNTM1LWJsb2cO0O0O.html.
[5] matlab Process audio signals 33, https://www.csdn.net/tags/OtTaAgysODM2MDUtYmxvZwO0O0OO0O0O.html.