[Speech processing] Adding noise to and denoising a speech signal with a MATLAB low-pass filter [with MATLAB source code, issue 1709]
2022-07-05 22:46:00 【Purple light】
One. How to get the code
How to get the code, option 1:
The complete code has been uploaded to my resources: [Speech processing] Adding noise to and denoising a speech signal with a MATLAB low-pass filter [with MATLAB source code, issue 1709]
How to get the code, option 2:
Subscribe to the paid column of the Ziji Shenguang blog, then send the blogger a private message with proof of payment to receive this code.
Remarks:
Subscribing to the paid column of the Ziji Shenguang blog entitles you to 1 free copy of code (valid for three days from the subscription date);
Two. Introduction to speech processing (course assignment report attached)
1 Characteristics of speech signals
Observation and analysis of a large number of speech signals show that they have two main characteristics:
① In the frequency domain, the spectral components of speech are concentrated mainly in the range 300~3400 Hz. An anti-aliasing band-pass filter can therefore be used to extract the frequency components in this range, and the signal can then be sampled at 8 kHz to obtain a discrete-time speech signal.
② In the time domain, speech has a "short-time" property: in general its characteristics change with time, but within a short interval the signal remains stable. Voiced segments look like a periodic signal, and unvoiced segments look like random noise.
2 Speech signal acquisition
Before a speech signal is digitized, it must first pass through an anti-aliasing pre-filter. Pre-filtering has two purposes: ① to suppress all components of the input signal whose frequency exceeds fs/2 (fs is the sampling frequency), which prevents aliasing interference; ② to suppress 50 Hz power-line interference. The pre-filter must therefore be a band-pass filter with upper and lower cut-off frequencies fH and fL. For most speech codecs, fH = 3400 Hz, fL = 60~100 Hz, and the sampling rate is fs = 8 kHz. For speech recognition, the specifications are the same as for speech codecs when the system serves telephone users; in applications with higher requirements, fH = 4500 Hz or 8000 Hz, fL = 60 Hz, and fs = 10 kHz or 20 kHz.
To convert the original analog speech signal into a digital signal, it must go through two steps, sampling and quantization, which yield a digital speech signal that is discrete in both time and amplitude. Sampling is the discretization of the signal in time: the instantaneous value of the analog signal x(t) is taken point by point at a fixed time interval △t. Sampling must satisfy the Nyquist theorem, that is, the sampling frequency fs must be more than twice the highest frequency of the signal being measured. It is implemented by multiplying the analog signal by a sampling pulse train.
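The Nyquist condition can be checked numerically. The sketch below is only an illustration: the 8 kHz rate matches the text, while the two test tones (3000 Hz and 5000 Hz) and all variable names are assumptions chosen for the example. A tone above fs/2 produces exactly the same samples as a tone below fs/2, which is why the pre-filter has to remove it before sampling.

```matlab
% Aliasing demo: a 5000 Hz tone sampled at 8000 Hz is indistinguishable
% from a 3000 Hz tone. Tone frequencies and names are illustrative assumptions.
fs = 8000;                          % sampling rate (Hz)
n  = (0:fs-1)';                     % one second of sample indices (1 Hz FFT bins)
f_ok    = 3000;                     % below fs/2, satisfies the Nyquist condition
f_alias = 5000;                     % above fs/2, violates it

x_ok    = cos(2*pi*f_ok*n/fs);      % sampled 3000 Hz tone
x_alias = cos(2*pi*f_alias*n/fs);   % sampled 5000 Hz tone

% Find the strongest component in the first half of each spectrum
X_ok    = abs(fft(x_ok));
X_alias = abs(fft(x_alias));
[~, i_ok]    = max(X_ok(1:fs/2));
[~, i_alias] = max(X_alias(1:fs/2));
peak_ok    = i_ok - 1;              % bin index to Hz (bin spacing is 1 Hz here)
peak_alias = i_alias - 1;

fprintf('3000 Hz tone appears at %d Hz\n', peak_ok);
fprintf('5000 Hz tone appears at %d Hz\n', peak_alias);
```

Both tones show up at the same frequency in the sampled data, so no later processing can separate them.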
Two issues need attention during sampling: the choice of sampling interval and aliasing. The sampling interval of the analog signal must be determined first, and choosing △t involves several technical trade-offs. In general, the higher the sampling frequency, the denser the sampling points and the closer the discrete signal is to the original. However, an excessively high sampling frequency is undesirable: for a signal of fixed length T it produces too much data (N = T/△t) and adds unnecessary computation and storage, and if the number of samples N is limited, the sampled duration becomes too short and part of the signal is left out. If the sampling frequency is too low, the sampling points are too far apart, the discrete signal no longer reflects the waveform of the original signal, the signal cannot be restored, and aliasing results. According to the sampling theorem, when the sampling frequency is greater than twice the bandwidth of the signal, sampling loses no information, and the original waveform can be reconstructed without distortion from the samples using an ideal filter. Quantization is the discretization of amplitude: the amplitude is represented by binary quantization levels. The quantization levels change in discrete steps, whereas the actual amplitude is a continuous physical quantity, so each amplitude value is rounded to the nearest quantization level.
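The rounding step described above can be sketched as a uniform quantizer. This is an illustrative example, not part of the original code: the word length B, the 440 Hz test tone, and the full-scale range [-1, 1) are all assumptions.

```matlab
% Uniform quantization sketch: round each sample to the nearest of 2^B levels.
% B, the test tone and the [-1, 1) full-scale range are assumptions.
B  = 8;                              % word length in bits
fs = 8000;                           % sampling rate (Hz)
t  = (0:fs-1)'/fs;                   % one second of sample times
x  = 0.9*sin(2*pi*440*t);            % test signal inside the full-scale range

q  = 2/2^B;                          % step between adjacent quantization levels
xq = q*round(x/q);                   % round to the nearest level
xq = min(max(xq, -1), 1 - q);        % clip to the representable range

err    = x - xq;                     % quantization error
snr_db = 10*log10(sum(x.^2)/sum(err.^2));
fprintf('Max error %.5f (step/2 = %.5f), SNR %.1f dB\n', max(abs(err)), q/2, snr_db);
```

The error never exceeds half a step, and each additional bit halves the step, which is the usual reason given for the rule of thumb of roughly 6 dB of signal-to-noise ratio per bit.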
After pre-filtering and sampling, the speech signal is converted into binary digital code by an A/D converter. The anti-aliasing filter is usually integrated on the same chip as the analog-to-digital converter, so the digitization quality of speech signals is now generally assured.
Once the speech signal has been acquired, it needs to be analyzed, for example by time-domain analysis, spectrum analysis, spectrogram analysis, and noise filtering.
3 Speech signal analysis techniques
Speech signal analysis is the premise and foundation of speech signal processing. Only by analyzing the parameters that represent the essential characteristics of the speech signal can these parameters be used for efficient speech communication, speech synthesis, and speech recognition [8]. Moreover, the sound quality of synthesized speech and the recognition rate of a speech recognizer both depend on the accuracy and precision of speech signal analysis. Speech signal analysis therefore plays an important role in applications of speech signal processing.
"Short-time analysis" runs through the whole process of speech analysis. Taken as a whole, the characteristics of a speech signal and the parameters that describe them change with time, so speech is a non-stationary process and cannot be handled with digital signal processing techniques designed for stationary signals. However, different speech sounds are responses of vocal tract shapes formed by the movement of the oral muscles, and this movement is very slow relative to speech frequencies. So although the speech signal is time-varying, within a short interval (generally taken to be 10~30 ms) its characteristics remain essentially unchanged, that is, relatively stable. It can therefore be regarded as a quasi-stationary process: the speech signal is short-time stationary. Any analysis and processing of speech signals must therefore be built on a "short-time" basis, that is, "short-time analysis": the signal is divided into segments whose characteristic parameters are analyzed, each segment is called a "frame", and the frame length is generally 10~30 ms. For the speech signal as a whole, what is analyzed is the time series formed by the characteristic parameters of each frame.
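Splitting a signal into frames can be sketched as follows. The 25 ms frame length falls in the 10~30 ms range given above; the 10 ms hop (overlapping frames), the synthetic test signal, and the variable names are assumptions made for this example.

```matlab
% Framing sketch: cut a signal into overlapping short-time frames.
% Frame length, hop size and the test signal are illustrative assumptions.
fs = 8000;                               % sampling rate (Hz)
t  = (0:fs-1)'/fs;
x  = sin(2*pi*200*t);                    % placeholder one-second signal

frame_ms = 25;                           % frame length, within 10~30 ms
hop_ms   = 10;                           % shift between consecutive frames
N = round(frame_ms*1e-3*fs);             % samples per frame
H = round(hop_ms*1e-3*fs);               % samples per hop

num_frames = 1 + floor((length(x) - N)/H);
% Index matrix: column k holds the sample indices of frame k
idx    = bsxfun(@plus, (1:N)', (0:num_frames-1)*H);
frames = x(idx);                         % N-by-num_frames matrix of frames

fprintf('%d frames of %d samples each\n', num_frames, N);
```

Each column of `frames` can then be treated as a quasi-stationary segment, and any per-frame parameter computed over the columns gives the time series of characteristic parameters mentioned above.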
Depending on the nature of the parameters being analyzed, speech signal analysis can be divided into time-domain analysis, frequency-domain analysis, cepstral-domain analysis, and so on. Time-domain analysis has the advantages of simplicity, low computational cost, and clear physical meaning. However, because the most important perceptual characteristics of speech are reflected in the power spectrum, and phase changes play only a minor role, frequency-domain analysis is more important than time-domain analysis.
4 Time-domain analysis of speech signals
Time-domain analysis of a speech signal means analyzing and extracting its time-domain parameters. When analyzing speech, the first and most intuitive thing to look at is the time-domain waveform. Speech is itself a time-domain signal, so time-domain analysis was the earliest analysis method to be used and is also the most widely used; it works directly on the time-domain waveform. It is usually applied to the most basic parameter analysis tasks, such as speech segmentation, preprocessing, and coarse classification. Its characteristics are: ① it represents the speech signal intuitively, with clear physical meaning; ② it is easy to implement and needs little computation; ③ it yields some important speech parameters; ④ it needs only general-purpose equipment such as an oscilloscope and is easy to use.
The time-domain parameters of a speech signal include short-time energy, short-time zero-crossing rate, the short-time autocorrelation function, and the short-time average magnitude difference function. These form a basic set of short-time parameters that are used in all kinds of digital speech processing techniques [6]. A rectangular window or a Hamming window is generally used when computing them.
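Two of these parameters, short-time energy and short-time zero-crossing rate, can be sketched as below. The test signal (a 150 Hz tone followed by white noise, standing in for a voiced and an unvoiced segment), the frame sizes, and all names are assumptions; the Hamming window is written out explicitly so that the example needs no toolbox.

```matlab
% Short-time energy (Hamming window) and zero-crossing rate (rectangular window).
% Signal, frame length and hop are illustrative assumptions.
rng(0);                                         % reproducible noise
fs = 8000;
t  = (0:fs/2-1)'/fs;
x  = [sin(2*pi*150*t); 0.3*randn(fs/2, 1)];     % "voiced" half, then "unvoiced" half

N = 200;  H = 80;                               % 25 ms frames, 10 ms hop at 8 kHz
num_frames = 1 + floor((length(x) - N)/H);
idx    = bsxfun(@plus, (1:N)', (0:num_frames-1)*H);
frames = x(idx);                                % one frame per column

w = 0.54 - 0.46*cos(2*pi*(0:N-1)'/(N-1));       % Hamming window
energy = sum(bsxfun(@times, frames, w).^2, 1);  % short-time energy per frame

s = sign(frames);
s(s == 0) = 1;                                  % treat exact zeros as positive
zcr = sum(abs(diff(s, 1, 1)), 1)/(2*(N-1));     % fraction of sign changes per frame

fprintf('Mean ZCR: tone part %.3f, noise part %.3f\n', mean(zcr(1:40)), mean(zcr(60:end)));
```

The tone frames have a low zero-crossing rate and the noise frames a high one, which matches the voiced (periodic) versus unvoiced (noise-like) behaviour described in the characteristics section.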
5 Frequency-domain analysis of speech signals
Frequency-domain analysis of a speech signal means analyzing its frequency-domain characteristics. In a broad sense it includes analysis of the spectrum, power spectrum, cepstrum, and spectral envelope of the speech signal. Commonly used methods are the band-pass filter bank method, the Fourier transform method, and the linear prediction method.
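As a small illustration of the Fourier transform method, the sketch below computes the power spectrum and the real cepstrum of a single windowed frame. The frame content (two tones at 1000 Hz and 2000 Hz), the frame length, and the variable names are assumptions; the cepstrum is computed here as the inverse FFT of the log magnitude spectrum, one common definition.

```matlab
% Power spectrum and real cepstrum of one frame via the FFT.
% Frame content and length are illustrative assumptions.
fs = 8000;
N  = 256;                                    % frame length in samples (32 ms)
n  = (0:N-1)';
frame = sin(2*pi*1000*n/fs) + 0.5*sin(2*pi*2000*n/fs);

w = 0.54 - 0.46*cos(2*pi*n/(N-1));           % Hamming window
S = fft(frame .* w);                         % short-time spectrum
P = abs(S).^2 / N;                           % power spectrum estimate
c = real(ifft(log(abs(S) + eps)));           % real cepstrum (eps avoids log(0))

f = (0:N/2-1)'*fs/N;                         % frequency axis for the first half
[~, k]  = max(P(1:N/2));
peak_hz = f(k);                              % frequency of the strongest component
fprintf('Strongest component at %.0f Hz\n', peak_hz);
```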
Three. Partial source code
% Time-domain waveform and spectrum analysis of the original speech signal
[y,fs]=audioread('6.wav'); % wavread was removed in later MATLAB releases; audioread replaces it
info=audioinfo('6.wav');
bits=info.BitsPerSample; % bit depth, used when playing back the filtered audio
fs % Display the sampling rate
sound(y,fs) % Play back the speech signal
pause(19); % Wait for playback to finish
n=length(y) % Number of points for the transform
y_p=fft(y,n); % n-point Fourier transform to the frequency domain
f=fs*(0:n/2-1)/n; % Frequency of each point
figure(1)
subplot(2,1,1);
plot(y); % Time-domain waveform of the speech signal
title('Time-domain waveform of the original speech signal after sampling');
xlabel('Time axis')
ylabel('Amplitude A')
subplot(2,1,2);
plot(f,abs(y_p(1:n/2))); % Spectrum of the speech signal
title('Spectrum of the original speech signal after sampling');
xlabel('Frequency (Hz)');
ylabel('Magnitude');
% Add noise to the audio signal
L=length(y) % Length of the audio signal
noise=0.1*randn(L,1); % Random noise of the same length (the noise level is set by the amplitude factor 0.1)
y_z=y+noise; % Superimpose the two signals to obtain the noisy signal
sound(y_z,fs) % Play back the noisy signal
pause(19) % Wait for playback to finish
% Analyze the noisy speech signal
n=length(y); % Number of points for the transform
y_zp=fft(y_z,n); % n-point Fourier transform to the frequency domain
f=fs*(0:n/2-1)/n; % Frequency of each point
figure(2)
subplot(2,1,1);
plot(y_z); % Time-domain waveform of the noisy speech signal
title('Time-domain waveform of the noisy speech signal');
xlabel('Time axis')
ylabel('Amplitude A')
subplot(2,1,2);
plot(f,abs(y_zp(1:n/2))); % Spectrum of the noisy speech signal
title('Spectrum of the noisy speech signal');
xlabel('Frequency (Hz)');
ylabel('Magnitude');
% Denoising of the noisy speech signal:
fp=1500;fc=1700;As=100;Ap=1; % Presumably passband edge and stopband edge (Hz), stopband attenuation and passband ripple (dB)
% NOTE: the filter design and filtering step is not included in this partial listing.
% The plots below expect x (the filtered signal) and X=fft(x,n) (its spectrum).
figure(4);
subplot(2,2,1);
plot(f,abs(y_zp(1:n/2)));
title('Spectrum of the signal before filtering');
subplot(2,2,2);
plot(f,abs(X(1:n/2)));
title('Spectrum of the signal after filtering');
subplot(2,2,3);
plot(y_z);
title('Waveform of the signal before filtering')
subplot(2,2,4);
plot(x);
title('Waveform of the signal after filtering')
sound(x,fs,bits) % Play back the filtered audio
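The partial listing uses x and X without showing how they are obtained. The sketch below is one possible way to fill that step; it is not the author's code. It assumes a Kaiser-window FIR low-pass design, reads fp and fc as the passband and stopband edges in Hz and As as the stopband attenuation in dB, and uses an assumed sampling rate of 8 kHz with a synthetic two-tone input in place of the recorded file. Only base MATLAB functions are used; Ap is not needed, because a Kaiser-window design gives nearly equal ripple in both bands, far below 1 dB at this attenuation.

```matlab
% Kaiser-window FIR low-pass sketch (an assumed design, not the original code).
% fs and the test signal are assumptions; fp, fc, As follow the listing.
fs = 8000;                        % assumed sampling rate (Hz)
fp = 1500; fc = 1700;             % passband edge and stopband edge (Hz)
As = 100;                         % stopband attenuation (dB)

% Kaiser's empirical design formulas
dw   = 2*pi*(fc - fp)/fs;         % transition width (rad/sample)
M    = ceil((As - 8)/(2.285*dw)); % filter order
M    = M + mod(M, 2);             % force an even order (odd number of taps)
beta = 0.1102*(As - 8.7);         % window shape parameter, valid for As > 50

% Ideal low-pass impulse response, cut off in the middle of the transition band
k  = (0:M)' - M/2;                % tap index centred on zero
wc = 2*pi*((fp + fc)/2)/fs;       % cutoff frequency (rad/sample)
hd = sin(wc*k)./(pi*k);
hd(k == 0) = wc/pi;               % limit value at the centre tap

% Kaiser window and final filter coefficients
w = besseli(0, beta*sqrt(1 - (2*k/M).^2)) / besseli(0, beta);
b = hd .* w;

% Apply to a synthetic noisy signal: 500 Hz tone (kept) plus 3000 Hz tone (removed)
t   = (0:fs-1)'/fs;
y_z = sin(2*pi*500*t) + 0.5*sin(2*pi*3000*t);
x   = filter(b, 1, y_z);          % filtered signal, delayed by M/2 samples
n   = length(x);
X   = fft(x, n);                  % spectrum of the filtered signal
```

With a real recording, fs and y_z would come from the earlier steps of the listing instead of the synthetic values used here. The narrow 200 Hz transition band combined with 100 dB attenuation leads to a long filter, so the order grows further at higher sampling rates.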
Four. Running results
Five. MATLAB version and references
1 MATLAB version
2014a
2 References
[1] Han Jiqing, Zhang Lei, Zheng Tieran. Speech Signal Processing (3rd edition) [M]. Tsinghua University Press, 2019.
[2] Liu Ruobian. Deep Learning: Speech Recognition Technology in Practice [M]. Tsinghua University Press, 2019.
[3] Song Yunfei, Jiang Zhancai, Wei Zhonghua. Speech processing interface design based on MATLAB GUI [J]. Technology Information, 2013, (02).