当前位置:网站首页>Using tensorflow to realize voiceprint recognition
Using tensorflow to realize voiceprint recognition
2022-07-02 23:55:00 【TomCruisePro】
Preface
This paper introduces the use of tensorflow Implement a simple voiceprint recognition model , First, you need to be familiar with audio classification , If you don't know, you can check my last article - be based on tensorflow Realize sound classification , On this basis , We train a voiceprint recognition model , Through this model, we can identify who the speaker is , It can be applied to some audio verification projects . The difference is that this project uses ArcFace loss,ArcFace loss:Additive Angular Margin Loss( Additive angular interval loss function ), Normalize the feature vector and weight , Yes θ Plus angular spacing , The influence of angle interval on angle is more direct than cosine interval .
Usage environment
python3.8
tensorflow2.3.0
Model download
https://download.csdn.net/download/qq_33200967/20368421
install
1.pip install tensorflow==2.3.0 -i https://mirrors.aliyun.com/pypi/simple/
2.pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/
Create data
1. This niuma uses a Chinese speech corpus , This dataset has 3342 Personal voice data , Yes 1130000+ Voice data . If the reader has other better data sets , Can be mixed together , Use python Tool module of aukit Processing audio 、 Noise reduction and mute removal
2. First create a data list ,** The format of the data list is < Voice file path / Voice tag >, Creating this list is mainly It is convenient to read later , It is also convenient to read other voice data sets , Voice classification tags are the unique identification of different people id, Different voice data sets , You can write the corresponding function to generate the data list , Write these data sets in the data list
3. stay create_data.py Write down the code in , Because Chinese speech corpus This data set is mp3 Format , Ben niuma found that the reading speed of this format is very slow , So I put all mp3 Format of audio conversion to wav Format , After creating the data list , Some data may be wrong , So we need to check , Delete the wrong data . Execute the following procedure to complete data preparation
import json
import os
from pydub import AudioSegment
from tqdm import tqdm
from utils.reader import load_audio
# Generate data list
def get_data_list(infodata_path, list_path, zhvoice_path):
with open(infodata_path, 'r', encoding='utf-8') as f:
lines = f.readlines()
f_train = open(os.path.join(list_path, 'train_list.txt'), 'w')
f_test = open(os.path.join(list_path, 'test_list.txt'), 'w')
sound_sum = 0
speakers = []
speakers_dict = {
}
for line in tqdm(lines):
line = json.loads(line.replace('\n', ''))
duration_ms = line['duration_ms']
if duration_ms < 1300:
continue
speaker = line['speaker']
if speaker not in speakers:
speakers_dict[speaker] = len(speakers)
speakers.append(speaker)
label = speakers_dict[speaker]
sound_path = os.path.join(zhvoice_path, line['index'])
save_path = "%s.wav" % sound_path[:-4]
if not os.path.exists(save_path):
try:
wav = AudioSegment.from_mp3(sound_path)
wav.export(save_path, format="wav")
os.remove(sound_path)
except Exception as e:
print(' Data error :%s, Information :%s' % (sound_path, e))
continue
if sound_sum % 200 == 0:
f_test.write('%s\t%d\n' % (save_path.replace('\\', '/'), label))
else:
f_train.write('%s\t%d\n' % (save_path.replace('\\', '/'), label))
sound_sum += 1
f_test.close()
f_train.close()
# Delete error audio
def remove_error_audio(data_list_path):
with open(data_list_path, 'r', encoding='utf-8') as f:
lines = f.readlines()
lines1 = []
for line in tqdm(lines):
audio_path, _ = line.split('\t')
try:
spec_mag = load_audio(audio_path)
lines1.append(line)
except Exception as e:
print(audio_path)
print(e)
with open(data_list_path, 'w', encoding='utf-8') as f:
for line in lines1:
f.write(line)
if __name__ == '__main__':
get_data_list('dataset/zhvoice/text/infodata.json', 'dataset', 'dataset/zhvoice')
remove_error_audio('dataset/train_list.txt')
remove_error_audio('dataset/test_list.txt')
After executing the above code , The following data formats will be generated ,, If you want to customize data , Refer to the following data list , The front is the relative path of audio , Behind it is the label of the speaker , for example
dataset/zhvoice/zhmagicdata/5_895/5_895_20170614203758.wav 3238
data fetch
边栏推荐
- 判断二叉树是否为满二叉树
- Where is the win11 automatic shutdown setting? Two methods of setting automatic shutdown in win11
- How to apply for company email when registering in company email format?
- 开发知识点
- Returns the size of the largest binary search subtree in a binary tree
- MFC file operation
- Leetcode DP three step problem
- A single element in an ordered array -- Valentine's Day mental problems
- 95 pages of smart education solutions 2022
- 95页智慧教育解决方案2022
猜你喜欢
![[shutter] shutter open source project reference](/img/3f/b1d4edd8f8e8fd8e6b39548448270d.jpg)
[shutter] shutter open source project reference

C MVC creates a view to get rid of the influence of layout

【ML】李宏毅三:梯度下降&分类(高斯分布)

Convolution和Batch normalization的融合

附加:token;(没写完,别看…)

Container runtime analysis

Convolution和Batch normalization的融合

MySQL Foundation

Detailed explanation of 'viewpager' in compose | developer said · dtalk

Bean加载控制
随机推荐
How does win11 turn on visual control? Win11 method of turning on visual control
Fudian bank completes the digital upgrade | oceanbase database helps to layout the distributed architecture of the middle office
CADD course learning (4) -- obtaining proteins without crystal structure (Swiss model)
Judge whether the binary tree is full binary tree
接口自动化覆盖率统计——Jacoco使用
YOLOX加强特征提取网络Panet分析
返回二叉树两个节点间的最大距离
Use of cocospods
Linux 下安装 redis
How can cross-border e-commerce achieve low-cost and steady growth by laying a good data base
附加:token;(没写完,别看…)
ArrayList分析2 :Itr、ListIterator以及SubList中的坑
Yolox enhanced feature extraction network panet analysis
JDBC練習案例
MFC 获取当前时间
接口差异测试——Diffy工具
1380. Lucky numbers in the matrix
Interface automation coverage statistics - used by Jacobo
Interface switching based on pyqt5 toolbar button -2
[error record] the flutter reports an error (could not resolve io.flutter:flutter_embedding_debug:1.0.0.)