[Speech discrimination] Speech signal discrimination based on the double-threshold method in MATLAB [with MATLAB source code, No. 1720]
2022-06-25 06:20:00 【Purple extreme divine light】
I. How to get the code
Method 1:
The complete code has been uploaded to my resources: [Speech discrimination] Speech signal discrimination based on the double-threshold method in MATLAB [with MATLAB source code, No. 1720]
Method 2:
Subscribe to the paid column of the Ziji Shenguang blog, then send the blogger a private message with proof of payment to receive this code.
Note:
Subscribing to the paid column of the Ziji Shenguang blog entitles you to one free copy of the code (valid for three days from the subscription date).
II. Brief introduction to short-time energy
1 Basic concepts
The double-threshold method was originally proposed on the basis of two features: the short-time average energy and the short-time average zero-crossing rate. The principle is as follows. The finals of Chinese syllables contain vowels, which carry more energy, so the finals can be located from the short-time average energy. The initials are consonants, which have higher frequency content and therefore a larger short-time average zero-crossing rate. Using these two features to locate the initials and finals amounts to locating the complete Chinese syllable. The double-threshold method is implemented as a two-level decision.
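The contrast between the two features can be sketched on two synthetic frames. This is only an illustration: the sampling rate, the frame length and both test tones are assumptions, not values from the original code.

```matlab
% Sketch: short-time energy and zero-crossing count of two synthetic frames.
% fs, wlen and both test signals are assumptions chosen for illustration.
fs   = 8000;                       % assumed sampling rate in Hz
wlen = 200;                        % frame length in samples
n    = (0:wlen-1)';

vowelLike     = 0.8*sin(2*pi*200*n/fs  + 0.3);   % strong low-frequency tone
consonantLike = 0.1*sin(2*pi*3000*n/fs + 0.3);   % weak high-frequency tone

% Short-time energy: sum of squared samples in the frame
energy = @(f) sum(f.^2);
% Zero-crossing count: number of sign changes between adjacent samples
zcr    = @(f) sum(f(1:end-1).*f(2:end) < 0);

E = [energy(vowelLike), energy(consonantLike)];
Z = [zcr(vowelLike),    zcr(consonantLike)];
fprintf('vowel-like:     energy = %.2f, zero crossings = %d\n', E(1), Z(1));
fprintf('consonant-like: energy = %.2f, zero crossings = %d\n', E(2), Z(2));
```

The vowel-like frame should show the larger energy and the consonant-like frame the larger zero-crossing count, which is the property the two decision levels rely on.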
2 First-level decision
① A rough decision is made with a higher threshold T2 chosen on the short-time energy envelope (drawn as a dashed horizontal line in the figure). Anything above T2 is certainly speech (that is, the segment CD certainly contains speech), and the start and end points of the speech must lie outside the time points where this threshold intersects the short-time energy envelope (that is, outside the segment CD).
② A lower threshold T is then chosen on the average energy (drawn as a solid horizontal line in the figure). Searching left from point C and right from point D, find the two points B and E where the short-time energy envelope intersects threshold T. The segment BE is the start and end position of the speech segment as determined by the double-threshold method from the short-time energy.
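The two steps of the first-level decision can be sketched on a made-up energy envelope. The envelope values, both threshold values and the index names `iB`, `iC`, `iD`, `iE` are hypothetical, and the sketch assumes the signal contains a single speech segment.

```matlab
% Sketch of the first-level decision on a synthetic short-time energy envelope.
% The envelope and both threshold values are made up for illustration.
En = [0 0 0.1 0.3 0.8 2 5 6 5.5 4 1.5 0.6 0.2 0.05 0 0];  % energy per frame
T2 = 3;      % higher threshold (rough decision)
T  = 0.5;    % lower threshold (refinement)

% Step 1: frames above T2 are certainly speech (segment CD).
% Assumes one speech segment, so the first and last hits bound it.
above = find(En > T2);
iC = above(1);
iD = above(end);

% Step 2: search left from C and right from D while the envelope stays above T
iB = iC;
while iB > 1 && En(iB-1) > T
    iB = iB - 1;
end
iE = iD;
while iE < numel(En) && En(iE+1) > T
    iE = iE + 1;
end
fprintf('CD = frames %d..%d, BE = frames %d..%d\n', iC, iD, iB, iE);
```

With these made-up values the sketch reports CD = frames 7..10 and BE = frames 5..12.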
3 Second-level decision
Using the short-time average zero-crossing rate as the criterion, search left from point B and right from point E to find the two points A and F where the short-time average zero-crossing rate falls below a threshold T3 (drawn as a horizontal dashed line in the figure). These are the start and end points of the speech segment.
These two levels of decision give the start point A and the end point F of the speech. However, the silent region between words has a minimum length during pronunciation, which marks the pause between sounds. The end of the speech segment is therefore declared only after the signal has stayed below threshold T3 for at least this minimum length. In effect this extends the tail of the speech segment: on the speech waveform in Figure 6-1-1 the start and end points are marked A and F+ (the figure shows the end point at F, but in actual processing it is extended to F+).
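One possible reading of the second-level decision, including the minimum-silence rule, is sketched below. The zero-crossing values, `T3`, the BE segment and `minSilence` are all made up, and the way short dips are bridged is an assumption about the rule rather than the original implementation.

```matlab
% Sketch of the second-level decision with a minimum-silence rule.
% Z, T3, the BE segment and minSilence are made-up illustration values.
Z  = [2 3 2 9 12 15 6 4 5 7 14 11 9 2 10 3 2 1 2 2];  % zero crossings per frame
T3 = 8;              % zero-crossing threshold
iB = 7;  iE = 10;    % segment BE taken from the energy-based first level
minSilence = 2;      % frames that must stay below T3 before the segment ends

% Start point A: extend left from B while the count stays at or above T3
iA = iB;
while iA > 1 && Z(iA-1) >= T3
    iA = iA - 1;
end

% End point: extend right from E and stop only after minSilence
% consecutive frames below T3, so shorter dips do not end the segment
iF = iE;             % last frame found at or above T3 (point F)
k = iE;  quiet = 0;
while k < numel(Z) && quiet < minSilence
    k = k + 1;
    if Z(k) >= T3
        iF = k;  quiet = 0;
    else
        quiet = quiet + 1;
    end
end
iFplus = k;          % frame at which the end is declared (point F+)
fprintf('A = %d, F = %d, F+ = %d\n', iA, iF, iFplus);
```

With these values the one-frame dip at frame 14 is bridged, giving A = 4, F = 15 and F+ = 17.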
In the actual endpoint detection procedure, the speech is first divided into frames (introduced in Chapter 2). The short-time average energy and short-time average zero-crossing rate are computed on the framed signal, and the thresholds are then compared and the decisions made frame by frame.
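The framing step and the frame-by-frame comparison can be sketched as follows. The test signal, the sampling rate and the relative energy threshold are assumptions for illustration; the frame length and frame shift match the values used in the source code of this post.

```matlab
% Sketch: frame a signal, compute per-frame energy and zero crossings,
% then compare each frame against a threshold.
% The test signal, fs and the threshold are assumptions for illustration.
fs = 8000;  wlen = 200;  inc = 80;          % frame length and frame shift
t  = (0:fs-1)'/fs;                          % one second of samples
x  = 0.001*sin(2*pi*1234*t + 0.3);          % weak background
idx = t >= 0.4 & t < 0.6;
x(idx) = x(idx) + 0.8*sin(2*pi*200*t(idx) + 0.3);  % loud burst, 0.4 s to 0.6 s

N  = length(x);
fn = fix((N - wlen)/inc) + 1;               % number of frames
En  = zeros(fn, 1);
Zcr = zeros(fn, 1);
for i = 1:fn
    frame  = x((i-1)*inc + (1:wlen));       % samples of frame i
    En(i)  = sum(frame.^2);                 % short-time energy
    Zcr(i) = sum(frame(1:end-1).*frame(2:end) < 0);  % sign changes
end

isSpeech = En > 0.1*max(En);                % frame-by-frame threshold comparison
fprintf('%d of %d frames exceed the energy threshold\n', sum(isSpeech), fn);
```

A fixed fraction of the maximum energy is used here only to keep the sketch short; it is not how the thresholds in the original code are derived.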
III. Partial source code
clear all; clc; close all;
filedir=[];                             % path to the audio file (empty: current folder)
filename='s.wav';                       % file name
fle=[filedir filename];                 % full path and file name
[x,fs]=audioread(fle);                  % read the audio data
x=x/max(abs(x));                        % normalize the amplitude
N=length(x);                            % signal length in samples
time=(0:N-1)/fs;                        % time axis in seconds
pos = get(gcf,'Position');              % create the figure and get its position
set(gcf,'Position',[pos(1), pos(2)-100,pos(3),(pos(4)-200)]);
plot(time,x,'k');
title('Male voice "Blue sky, white clouds, green sea": endpoint detection');
ylabel('Amplitude'); axis([0 max(time) -1 1]); grid;
xlabel('Time/s');
wlen=200; inc=80;                       % frame length and frame shift
IS=0.1; overlap=wlen-inc;               % IS (in seconds) and frame overlap
NIS=fix((IS*fs-wlen)/inc +1);           % number of frames within IS
fn=fix((N-wlen)/inc)+1;                 % total number of frames
frameTime=frame2time(fn, wlen, inc, fs);% time of each frame (helper not shown here)
[voiceseg,vsl,SF,NF]=vad_ezm1(x,wlen,inc,NIS); % endpoint detection (helper not shown here)
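The frame-count arithmetic in the listing can be checked with concrete numbers. In the sketch below, `fs` and `N` are assumed values (the listing reads them from `s.wav`), and the frame-centre convention for the time stamps is an assumption about what a helper such as `frame2time` returns, since that helper is not shown.

```matlab
% Sketch: number of leading frames within IS and a time stamp per frame.
% fs and N are assumed; the frame-centre convention is an assumption
% about the helper frame2time, which is not part of this excerpt.
fs = 8000;  N = 16000;             % assumed: 2 s of audio at 8 kHz
wlen = 200; inc = 80; IS = 0.1;    % values taken from the listing

NIS = fix((IS*fs - wlen)/inc + 1); % frames that fit in the first IS seconds
fn  = fix((N - wlen)/inc) + 1;     % total number of frames

% Time of the centre of each frame, in seconds
frameTime = ((0:fn-1)*inc + wlen/2)/fs;
fprintf('NIS = %d, fn = %d, first frame at %.4f s, last at %.4f s\n', ...
        NIS, fn, frameTime(1), frameTime(end));
```

Under these assumed values, NIS = 8 and fn = 198, with the first frame centred at 0.0125 s and the last at 1.9825 s.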
IV. Running results

V. MATLAB version and references
1 MATLAB version
2014a
2 References
[1] Han Jiqing, Zhang Lei, Zheng Tieran. Speech Signal Processing (3rd ed.) [M]. Tsinghua University Press, 2019.
[2] Liu Ruobian. Deep Learning: Speech Recognition Technology in Practice [M]. Tsinghua University Press, 2019.
[3] Song Yunfei, Jiang Zhancai, Wei Zhonghua. Design of a speech processing interface based on MATLAB GUI [J]. Technology Information, 2013, (02).