What kind of experience is it to develop a "grandson" who will call himself "Grandpa"?
2022-06-30 09:05:00 【Have a promising future】
"Living alone is boring. If only there were something around to chat with me." Based on this idea, I decided to build that something and give it the ability to read out specified text.
At present, two products on the market handle voice features well: Baidu speech recognition and iFLYTEK speech recognition. Of the two, I chose iFLYTEK. If your development language is Python, Node.js, C#, C++, or PHP, Baidu speech recognition has documentation for it. If the speech recognition will run on HarmonyOS, iFLYTEK is an option. Each has its own strengths and weaknesses.
Before starting, download the voice demo from the official website. The demo contains the resources we need to integrate the voice technology.
After downloading, copy the assets and libs folders into your own project, and statically declare the required permissions in AndroidManifest.xml.
<!-- Network access, used for the cloud voice capabilities -->
<uses-permission android:name="android.permission.INTERNET"/>
<!-- Microphone access; dictation, recognition, and semantic understanding require this permission -->
<uses-permission android:name="android.permission.RECORD_AUDIO"/>
<!-- Read the network state -->
<uses-permission android:name="android.permission.ACCESS_NETWORK_STATE"/>
<!-- Read the current Wi-Fi state -->
<uses-permission android:name="android.permission.ACCESS_WIFI_STATE"/>
<!-- Allow the app to change the network connection state -->
<uses-permission android:name="android.permission.CHANGE_NETWORK_STATE"/>
<!-- Read phone information -->
<uses-permission android:name="android.permission.READ_PHONE_STATE"/>
<!-- External storage write permission, required for building grammars -->
<uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE"/>
<!-- External storage read permission, required for building grammars -->
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE"/>
<!-- Settings permission, used to record application configuration -->
<uses-permission android:name="android.permission.WRITE_SETTINGS"/>
<!-- Device location, used to give features such as semantic understanding more accurate results -->
<!-- Location is sensitive information; location requests can be turned off with Setting.setLocationEnable(false) -->
<uses-permission android:name="android.permission.ACCESS_FINE_LOCATION"/>
Then request the dangerous permissions at runtime:
/**
 * On Android 6.0 and above, the audio recording and external storage write
 * permissions must be requested at runtime.
 */
private void initPermission() {
    String[] permissions = {
            Manifest.permission.RECORD_AUDIO,
            Manifest.permission.WRITE_EXTERNAL_STORAGE
    };
    // Collect only the permissions that have not been granted yet
    ArrayList<String> toApplyList = new ArrayList<String>();
    for (String perm : permissions) {
        if (PackageManager.PERMISSION_GRANTED != ContextCompat.checkSelfPermission(this, perm)) {
            toApplyList.add(perm);
        }
    }
    String[] tmpList = new String[toApplyList.size()];
    if (!toApplyList.isEmpty()) {
        // 123 is the request code reported back in onRequestPermissionsResult
        ActivityCompat.requestPermissions(this, toApplyList.toArray(tmpList), 123);
    }
}
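The request above reports back through the activity's `onRequestPermissionsResult(int requestCode, String[] permissions, int[] grantResults)` callback, which the article does not show. As a minimal sketch, the helper below (a hypothetical class, not part of the Android SDK or the iFLYTEK demo) picks out the permissions that were refused, so the app can decide whether to disable recording or explain why the permission is needed. It is written as plain Java so it can be tried outside an Android project; the constant mirrors `PackageManager.PERMISSION_GRANTED`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the Android SDK or the iFLYTEK demo. */
final class PermissionResults {
    // Mirrors PackageManager.PERMISSION_GRANTED, which is 0 on Android.
    static final int PERMISSION_GRANTED = 0;

    private PermissionResults() {
    }

    /** Returns the permissions whose grant result is not PERMISSION_GRANTED. */
    static List<String> denied(String[] permissions, int[] grantResults) {
        List<String> denied = new ArrayList<>();
        // The two arrays are parallel; both may be empty if the request was cancelled.
        int n = Math.min(permissions.length, grantResults.length);
        for (int i = 0; i < n; i++) {
            if (grantResults[i] != PERMISSION_GRANTED) {
                denied.add(permissions[i]);
            }
        }
        return denied;
    }
}
```

Inside the activity, the callback would pass its `permissions` and `grantResults` arguments straight to `PermissionResults.denied(...)` after checking that `requestCode` is the 123 used in the request.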
iFLYTEK speech recognition must be initialized before use. Initialization creates the speech configuration object, and the MSC services can only be used after it completes. It is recommended to initialize at the program entry point (such as the onCreate method of the Application or Activity).
// Replace the string after "=" with the APPID you applied for. Apply at: http://www.xfyun.cn
// Do not add any whitespace or escape characters between "=" and the appid
SpeechUtility.createUtility(this, "appid=" + getString(R.string.app_id));
Implement the speech recognition listener:
/**
 * Speech recognition listener.
 */
private RecognizerDialogListener mRecognizerDialogListener = new RecognizerDialogListener() {
    @Override
    public void onResult(RecognizerResult results, boolean isLast) {
        if (!isLast) {
            // Handle the recognition result here
        }
    }

    // Error callback for recognition.
    @Override
    public void onError(SpeechError error) {
        Toast.makeText(MainActivity.this, error.getPlainDescription(true),
                Toast.LENGTH_SHORT).show();
    }
};
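Because `onResult` carries an `isLast` flag, a single utterance can arrive across several callbacks. The sketch below shows one way to stitch those pieces together; it assumes the segment number and text have already been extracted from each JSON result (the parsing is not shown in the article), and `DictationBuffer` is a hypothetical name, not an SDK class.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical buffer that joins partial dictation results; not an iFLYTEK SDK class. */
final class DictationBuffer {
    // TreeMap keeps segments ordered by segment number regardless of arrival order.
    private final Map<Integer, String> segments = new TreeMap<>();

    /** Stores a segment; a later result with the same number replaces the earlier one. */
    void put(int segmentNumber, String text) {
        segments.put(segmentNumber, text);
    }

    /** Returns all segments joined in segment-number order. */
    String text() {
        StringBuilder sb = new StringBuilder();
        for (String part : segments.values()) {
            sb.append(part);
        }
        return sb.toString();
    }

    /** Drops everything, ready for the next utterance. */
    void clear() {
        segments.clear();
    }
}
```

Clearing the buffer before each new recognition session keeps text from a previous utterance out of the next one.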
Next, call the grandson:
public void call() {
    // With a SpeechRecognizer object you can build a custom UI from the callback messages
    mIat = SpeechRecognizer.createRecognizer(this, mInitListener);
    if (null == mIat) {
        // Creating the singleton failed, for the same reason as error 21001; see
        // http://bbs.xfyun.cn/forum.php?mod=viewthread&tid=9688
        Toast.makeText(this, "Failed to create the object. Make sure libmsc.so is placed correctly and createUtility has been called for initialization",
                Toast.LENGTH_SHORT).show();
        return;
    }
    // Clear previous results
    mIatResults.clear();
    // Set parameters: start by clearing any existing ones
    mIat.setParameter(SpeechConstant.PARAMS, null);
    // Set the dictation engine type
    mIat.setParameter(SpeechConstant.ENGINE_TYPE, SpeechConstant.TYPE_CLOUD);
    // Set the data format of the returned results
    mIat.setParameter(SpeechConstant.RESULT_TYPE, "json");
    if (language.equals("zh_cn")) {
        String lag = mSharedPreferences.getString("iat_language_preference",
                "mandarin");
        // Set the language
        mIat.setParameter(SpeechConstant.LANGUAGE, "zh_cn");
        // Set the accent (regional variant)
        mIat.setParameter(SpeechConstant.ACCENT, lag);
    } else {
        mIat.setParameter(SpeechConstant.LANGUAGE, language);
    }
    // Do not display error code information in the dialog
    mIat.setParameter("view_tips_plain", "false");
    // Leading-silence timeout: how long the user may stay silent before it counts as a timeout
    mIat.setParameter(SpeechConstant.VAD_BOS, mSharedPreferences.getString(
            "iat_vadbos_preference", "4000"));
    // Trailing-silence timeout: how long after the user stops speaking the input is
    // considered finished and recording stops automatically
    mIat.setParameter(SpeechConstant.VAD_EOS, mSharedPreferences.getString(
            "iat_vadeos_preference", "1000"));
    // Punctuation: "0" returns results without punctuation, "1" returns results with punctuation
    mIat.setParameter(SpeechConstant.ASR_PTT, mSharedPreferences.getString(
            "iat_punc_preference", "1"));
    // Audio save path; pcm and wav formats are supported. Saving to the SD card
    // requires the WRITE_EXTERNAL_STORAGE permission
    mIat.setParameter(SpeechConstant.AUDIO_FORMAT, "wav");
    mIat.setParameter(SpeechConstant.ASR_AUDIO_PATH,
            Environment.getExternalStorageDirectory() + "/msc/iat.wav");
    // mIatDialog is the recognition dialog field; its creation is not shown in this article
    mIatDialog.setListener(mRecognizerDialogListener); // Set the listener
    mIatDialog.show(); // Show the dialog
}
See the show() method? Once show() is executed, the speech recognition feature is complete.

Grandpa asks a question, but at this point the grandson cannot reply or greet him. He can only look at Grandpa silently, like a wooden figure at a loss.
For the grandson to speak, we need speech synthesis. Speech synthesis is the opposite of speech dictation: it converts text into speech, and can synthesize different voices, speeds, and tones as required, letting the machine speak like a human. Beyond that, you can also swap grandchildren to suit your needs. High-end grandsons and granddaughters can often handle regional dialects and multiple languages.
How does the grandson know what to reply?
The replies are preset by us in advance, and different utterances get different replies. Note that speech recognition and command word recognition are different things: command word recognition recognizes the speech and outputs the pre-set keywords it extracts as text, while speech recognition transcribes the whole utterance directly into text.
I inserted Grandpa's utterances and the grandson's replies into SQLite in advance. When the recognized text matches one of Grandpa's utterances stored in SQLite, the corresponding answer is fetched and output as speech, so the grandson is no longer mute.
So, let's implement the grandson's voice reply.
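The lookup described above can be sketched as follows. This is only an illustration under stated assumptions: an in-memory map stands in for the SQLite table, `ReplyBook` and its methods are hypothetical names, and the normalization step (trimming, dropping trailing punctuation, lower-casing) is my own addition, on the reasoning that recognition results may come back with punctuation appended.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical utterance-to-reply table; a HashMap stands in for the SQLite table. */
final class ReplyBook {
    private final Map<String, String> replies = new HashMap<>();

    /** Registers Grandpa's utterance together with the grandson's reply. */
    void add(String utterance, String reply) {
        replies.put(normalize(utterance), reply);
    }

    /** Returns the stored reply if the recognized text matches a known utterance. */
    Optional<String> find(String recognizedText) {
        return Optional.ofNullable(replies.get(normalize(recognizedText)));
    }

    /** Trims, strips trailing whitespace and punctuation, and lower-cases the text. */
    static String normalize(String text) {
        return text.trim()
                .replaceAll("[\\s.,!?~。，！？]+$", "")
                .toLowerCase(Locale.ROOT);
    }
}
```

In the app, a non-empty result from `find(...)` would be passed to the speech synthesis method, and an empty one could fall back to a default reply or to silence.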
/**
 * The grandson replies.
 *
 * @param answer the text of the grandson's reply
 */
private void grandsonAnswer(String answer) {
    // Initialize the list of cloud speaker names
    cloudVoicersEntries = getResources().getStringArray(R.array.voicer_cloud_entries);
    cloudVoicersValue = getResources().getStringArray(R.array.voicer_cloud_values);
    // Initialize the synthesizer object
    mTts = SpeechSynthesizer.createSynthesizer(this, mInitListener);
    // Clear parameters
    mTts.setParameter(SpeechConstant.PARAMS, null);
    // Synthesis settings: use the cloud engine
    mTts.setParameter(SpeechConstant.ENGINE_TYPE, SpeechConstant.TYPE_CLOUD);
    // Set the speaker
    mTts.setParameter(SpeechConstant.VOICE_NAME, voicerCloud);
    // Real-time audio stream callback; only supported with synthesizeToUri
    //mTts.setParameter(SpeechConstant.TTS_DATA_NOTIFY,"1");
    // Set the synthesis speed
    mTts.setParameter(SpeechConstant.SPEED, "50");
    // Set the synthesis pitch
    mTts.setParameter(SpeechConstant.PITCH, "50");
    // Set the synthesis volume
    mTts.setParameter(SpeechConstant.VOLUME, "50");
    // Set the audio stream type used by the player
    mTts.setParameter(SpeechConstant.STREAM_TYPE, "3");
    // mTts.setParameter(SpeechConstant.STREAM_TYPE, AudioManager.STREAM_MUSIC+"");
    // Whether playing synthesized audio interrupts music playback; the default is true
    mTts.setParameter(SpeechConstant.KEY_REQUEST_FOCUS, "true");
    // Audio save path; pcm and wav formats are supported. Saving to the SD card
    // requires the WRITE_EXTERNAL_STORAGE permission
    mTts.setParameter(SpeechConstant.AUDIO_FORMAT, "wav");
    mTts.setParameter(SpeechConstant.TTS_AUDIO_PATH,
            Environment.getExternalStorageDirectory() + "/msc/tts.wav");
    int code = mTts.startSpeaking(answer, mTtsListener);
    if (code != ErrorCode.SUCCESS) {
        Toast.makeText(this, "Speech synthesis failed, error code: " + code
                + ". See https://www.xfyun.cn/document/error-code for solutions",
                Toast.LENGTH_SHORT).show();
    }
}
This is the result when it runs:
If you want to hear your grandson call you Grandpa, just pass "grandpa" as the argument when calling the method:
public void callGrandpa(View view) {
    grandsonAnswer(" grandpa ~");
}

If you get tired of the grandson's voice, you can switch to a granddaughter's voice:
/**
 * Switch grandchildren.
 */
public void update(View view) {
    new AlertDialog.Builder(this).setTitle("Change grandchildren")
            .setSingleChoiceItems(cloudVoicersEntries, // The items in the single-choice list and their names
                    selectedNumCloud, // The default selection
                    new DialogInterface.OnClickListener() {
                        // Handle a click on one of the choices
                        public void onClick(DialogInterface dialog,
                                            int which) {
                            // "which" is the index of the clicked item
                            voicerCloud = cloudVoicersValue[which];
                            selectedNumCloud = which;
                            dialog.dismiss();
                        }
                    }).show();
}
The grandsons and granddaughters you can switch between:
<!-- synthesis -->
<string-array name="voicer_cloud_entries">
<item> Xiaoyan </item>
<item> Xiaoyu </item>
<item> Catherine </item>
<item> Henry </item>
<item> Mary </item>
<item> Xiaoyan </item>
<item> Xiaoqi </item>
<item> Xiaofeng </item>
<item> Xiaomei </item>
<item> Xiaoli </item>
<item> Xiaorong </item>
<item> Xiaoyun </item>
<item> Xiao Kun </item>
<item> Xiaoqiang </item>
<item> Xiaoying </item>
<item> Xiaoxin </item>
<item> Nannan </item>
<item> Lao Sun </item>
</string-array>
<string-array name="voicer_cloud_values">
<item>xiaoyan</item>
<item>xiaoyu</item>
<item>catherine</item>
<item>henry</item>
<item>vimary</item>
<item>vixy</item>
<item>xiaoqi</item>
<item>vixf</item>
<item>xiaomei</item>
<item>xiaolin</item>
<item>xiaorong</item>
<item>xiaoqian</item>
<item>xiaokun</item>
<item>xiaoqiang</item>
<item>vixying</item>
<item>xiaoxin</item>
<item>nannan</item>
<item>vils</item>
</string-array>
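The two arrays are parallel: the index chosen in the dialog selects the display name from `voicer_cloud_entries` and the speaker value from `voicer_cloud_values`. If the lists ever drift out of sync, the lookup in `onClick` would throw or pick the wrong voice. The sketch below is a hypothetical wrapper (not part of the demo) that checks the lengths once and guards the index; it is plain Java so it can be tried outside Android.

```java
/** Hypothetical wrapper around the two parallel voice arrays; not part of the demo. */
final class VoiceCatalog {
    private final String[] entries; // display names shown in the dialog
    private final String[] values;  // speaker identifiers passed to the synthesizer

    VoiceCatalog(String[] entries, String[] values) {
        if (entries.length != values.length) {
            throw new IllegalArgumentException(
                    "entries and values must have the same length: "
                            + entries.length + " vs " + values.length);
        }
        this.entries = entries.clone();
        this.values = values.clone();
    }

    /** Returns the speaker value for the selected index. */
    String valueAt(int which) {
        checkIndex(which);
        return values[which];
    }

    /** Returns the display name for the selected index. */
    String entryAt(int which) {
        checkIndex(which);
        return entries[which];
    }

    private void checkIndex(int which) {
        if (which < 0 || which >= values.length) {
            throw new IndexOutOfBoundsException("No voice at index " + which);
        }
    }
}
```

With such a wrapper, the dialog's click handler would read the new speaker with `valueAt(which)` instead of indexing the raw array.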
At this point, all the features I could think of are complete. A GIF of the demo running is attached.

Finally, don't forget to release the corresponding objects when the screen is destroyed:
@Override
protected void onDestroy() {
    super.onDestroy();
    if (null != mIat) {
        // Release the connection on exit
        mIat.cancel();
        mIat.destroy();
    }
    if (null != mTts) {
        mTts.stopSpeaking();
        // Release the connection on exit
        mTts.destroy();
    }
}
The code for this article has been uploaded to: What kind of experience is it to develop a "grandson" who will call himself "Grandpa"?