Machine Learning Makes Text Recognition Easier: Kotlin + MVVM + Huawei ML Kit

2022-07-29 05:50:00 【Quantify NPC】

Introduction

Getting a computer to extract information from images or video streams is the domain of computer vision.
Text recognition is one branch of computer vision and currently one of the most popular AI applications. A text recognition service can recognize receipts, business cards, documents, photos, and other images that contain text, and extract the text from them. Point the camera at a picture or a real-world scene and the text in it is converted into digital text.

Use cases

In express delivery, recognizing an uploaded picture lets the app quickly fill in key information such as the recipient's name, phone number, and address in the corresponding fields.
Combined with translation, text recognition enables real-time photo translation. Faced with a menu or road sign they cannot read, users abroad can use this feature instead of worrying about the language. We have written the code for this; see the following blog post for the implementation:
https://blog.csdn.net/weixin_38132951/article/details/107352702
Converting PDF documents to Word, a common task, also relies on text recognition at its core. In everyday life, you sometimes need to extract the text from a picture to send to a friend or save in a memo.
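
As a rough illustration of the express delivery case: once recognition has returned a string, filling in the form is mostly text post-processing. The sketch below is plain Kotlin and does not touch ML Kit. The function name `extractPhone` and the sample text are made up for illustration, and the pattern assumes an 11-digit mainland China mobile number.

```kotlin
// Hypothetical helper: pull the first mainland-China mobile number
// (11 digits, starting with 1 and a second digit of 3-9) out of OCR output.
private val phonePattern = Regex("""(?<!\d)1[3-9]\d{9}(?!\d)""")

fun extractPhone(recognizedText: String): String? =
    phonePattern.find(recognizedText)?.value

fun main() {
    // Made-up sample of what a recognized shipping label might contain.
    val sample = "Recipient: Zhang San\nPhone: 13800138000\nAddress: 1 Example Road"
    println(extractPhone(sample))       // 13800138000
    println(extractPhone("no digits"))  // null
}
```

Name and address extraction would need layout or keyword cues from the label and is not covered here.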

Capability demonstration

First, let's take a look at the text recognition capability of Huawei ML Kit.
[Image: text recognition demo]

Feature description

[Image: feature overview]

Supported languages

The service is available both on the device and in the cloud, but the two differ in which languages they can recognize. The on-device API recognizes only Simplified Chinese, Japanese, Korean, and Latin characters (for the supported Latin characters, see the list of Latin characters supported by on-device text recognition). The cloud API recognizes Simplified Chinese, English, Spanish, Portuguese, Italian, German, French, Russian, Japanese, Korean, Polish, Finnish, Norwegian, Swedish, Danish, Turkish, Thai, Arabic, Hindi, Indonesian, and other languages.

Feature	Support
On-device	Chinese, Japanese, Korean, and Latin characters
Cloud	19 languages, including Chinese, English, French, Spanish, and Thai
Tilted text	Recognized at tilt angles of up to 30 degrees
Curved text	Recognized at bends of up to 45 degrees
Text tracking	Supported on-device
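
The split between on-device and cloud languages lends itself to a small routing helper. The following is only a sketch: `needsCloud` is a hypothetical name, the language codes are assumed to be ISO 639-1 codes, and the cloud-only set simply lists the languages named in this section that are neither Chinese, Japanese, Korean, nor written in Latin script. Check the official list of supported Latin characters before relying on it.

```kotlin
// Languages from this section that use neither CJK nor Latin script,
// so (by assumption) only the cloud API can recognize them.
val cloudOnlyLanguages = setOf("ru", "th", "ar", "hi")

// Hypothetical helper: true if any requested language requires the cloud API.
fun needsCloud(languages: Collection<String>): Boolean =
    languages.any { it.lowercase() in cloudOnlyLanguages }

fun main() {
    println(needsCloud(listOf("zh", "en")))  // false
    println(needsCloud(listOf("en", "hi")))  // true
}
```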

Recognition modes

The service supports both static image recognition and dynamic video stream recognition, with synchronous and asynchronous calls. Its rich set of APIs helps developers build text recognition apps quickly.

For details, see the Huawei Developers website:
https://developer.huawei.com/consumer/cn/doc/development/HMSCore-Guides/text-recognition-0000001050040053

Integration steps

Step 1: Create a new project in Android Studio.

Step 2: Select dependencies according to your project's requirements.

// Import the base SDK.
implementation 'com.huawei.hms:ml-computer-vision-ocr:1.0.3.300'
// Import the Latin-based language model package.
implementation 'com.huawei.hms:ml-computer-vision-ocr-latin-model:1.0.3.315'
// Import the Japanese and Korean model package.
implementation 'com.huawei.hms:ml-computer-vision-ocr-jk-model:1.0.3.300'
// Import the Chinese and English model package.
implementation 'com.huawei.hms:ml-computer-vision-ocr-cn-model:1.0.3.300'

The size of each model package is listed below.

Package type	Package name	Package size
Latin model	ml-computer-vision-ocr-latin-model	952 KB
Japanese and Korean model	ml-computer-vision-ocr-jk-model	2.14 MB
Chinese and English model	ml-computer-vision-ocr-cn-model	3.46 MB

For a lean build, include only the base SDK:

implementation 'com.huawei.hms:ml-computer-vision-ocr:1.0.3.300'

To have the machine learning model updated automatically, add the following inside the <application> element of the manifest file:

<meta-data
    android:name="com.huawei.hms.ml.DEPENDENCY"
    android:value="ocr" />

Add the following permissions to the manifest file:

<uses-permission android:name="android.permission.CAMERA" /> 
<uses-permission android:name="android.permission.INTERNET" />
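
CAMERA is a dangerous permission on Android 6.0 and later, so declaring it in the manifest is not enough; the app must also request it at run time. A minimal sketch using the standard AndroidX calls, assuming it is called from an Activity (the function name `ensureCameraPermission` and the request code are arbitrary choices for this example):

```kotlin
import android.Manifest
import android.app.Activity
import android.content.pm.PackageManager
import androidx.core.app.ActivityCompat
import androidx.core.content.ContextCompat

// Arbitrary request code, used to match the result in onRequestPermissionsResult().
private const val CAMERA_REQUEST_CODE = 100

// Returns true if the camera permission is already granted; otherwise asks for it.
fun ensureCameraPermission(activity: Activity): Boolean {
    val granted = ContextCompat.checkSelfPermission(
        activity, Manifest.permission.CAMERA
    ) == PackageManager.PERMISSION_GRANTED
    if (!granted) {
        ActivityCompat.requestPermissions(
            activity, arrayOf(Manifest.permission.CAMERA), CAMERA_REQUEST_CODE
        )
    }
    return granted
}
```

INTERNET is a normal permission and is granted at install time, so it needs no run-time request.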

Now let's jump to the TextRecognitionViewModel class, which receives a bitmap containing the user's image.
The following code calls the text recognition API and gets the response as a String.

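// Imports needed at the top of the ViewModel file.
import com.huawei.hms.mlsdk.MLAnalyzerFactory
import com.huawei.hms.mlsdk.common.MLFrame
import com.huawei.hms.mlsdk.text.MLRemoteTextSetting

fun textRecognition() {
    // Nothing to analyze until an image has been set.
    val image = bitmap.value ?: return

    // Configure the cloud analyzer: text density, languages, and border format.
    val setting = MLRemoteTextSetting.Factory()
        .setTextDensityScene(MLRemoteTextSetting.OCR_LOOSE_SCENE)
        .setLanguageList(listOf("zh", "en", "hi", "fr", "de"))
        .setBorderType(MLRemoteTextSetting.ARC)
        .create()
    val analyzer = MLAnalyzerFactory.getInstance().getRemoteTextAnalyzer(setting)

    // Wrap the bitmap in an MLFrame and start asynchronous recognition.
    val frame = MLFrame.fromBitmap(image)
    analyzer.asyncAnalyseFrame(frame)
        .addOnSuccessListener { text ->
            // Publish the full recognized string.
            result.value = text.stringValue
        }
        .addOnFailureListener {
            result.value = "Exception occurred"
        }
}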

I want to use the cloud service, so I chose MLRemoteTextSetting.
Depending on how dense the text is, setTextDensityScene() can be set to OCR_LOOSE_SCENE or OCR_COMPACT_SCENE.
Once the density is set, setLanguageList() sets the languages to recognize.
It accepts a list of language codes. I added five languages, but you can add others as needed.
MLRemoteTextSetting.ARC: returns the vertices of the polygon boundary in arc format.
Now that our customized MLRemoteTextSetting object is ready, we can pass it to getRemoteTextAnalyzer() to obtain an MLTextAnalyzer object.
The next step is to create an MLFrame from the bitmap:

MLFrame frame = MLFrame.fromBitmap(bitmap);

On the analyzer object, we call asyncAnalyseFrame(frame), passing in the MLFrame we just created.

This returns a Task object, on which you can register two callbacks:

onSuccess
onFailure
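
Since the two callbacks map onto two outcomes, the ViewModel could expose a typed state instead of writing the fallback string "Exception occurred" into the same field as real results. This is a design suggestion rather than part of the original code; `RecognitionState` and `describe` are hypothetical names, and the sketch is plain Kotlin with no ML Kit dependency.

```kotlin
// Hypothetical UI state for the text recognition screen.
sealed class RecognitionState {
    object Idle : RecognitionState()
    data class Success(val text: String) : RecognitionState()
    data class Failure(val cause: Throwable) : RecognitionState()
}

// Turn a state into the string shown to the user.
fun describe(state: RecognitionState): String = when (state) {
    is RecognitionState.Idle -> "No image analyzed yet"
    is RecognitionState.Success -> state.text
    is RecognitionState.Failure -> "Recognition failed: ${state.cause.message}"
}

fun main() {
    println(describe(RecognitionState.Success("Hello")))  // Hello
    println(describe(RecognitionState.Failure(RuntimeException("timeout"))))
    // Recognition failed: timeout
}
```

With this shape, onSuccess would publish a Success holding the recognized string and onFailure would publish a Failure holding the exception it receives, leaving the view to decide how to present each case.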

In onSuccess() you can save the recognized result, and then call analyzer.stop() to stop the analyzer and release detection resources.

If you want to use only the on-device model, make the following changes:

// Create the on-device setting first, then pass it to the analyzer factory.
MLLocalTextSetting setting = new MLLocalTextSetting.Factory()
    .setOCRMode(MLLocalTextSetting.OCR_DETECT_MODE)
    .setLanguage("en")
    .create();
MLTextAnalyzer analyzer = MLAnalyzerFactory.getInstance().getLocalTextAnalyzer(setting);
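
The snippet above is Java, while the ViewModel is written in Kotlin. A Kotlin rendering of the on-device setup, using only the calls already shown in this article, might look like the sketch below. The function name `recognizeOnDevice` and its callback parameter are hypothetical; the setting is built first and then handed to getLocalTextAnalyzer().

```kotlin
import android.graphics.Bitmap
import com.huawei.hms.mlsdk.MLAnalyzerFactory
import com.huawei.hms.mlsdk.common.MLFrame
import com.huawei.hms.mlsdk.text.MLLocalTextSetting

// Hypothetical helper: run on-device recognition and report the result as a String.
fun recognizeOnDevice(bitmap: Bitmap, onResult: (String) -> Unit) {
    // Build the on-device setting, then create the analyzer from it.
    val setting = MLLocalTextSetting.Factory()
        .setOCRMode(MLLocalTextSetting.OCR_DETECT_MODE)
        .setLanguage("en")
        .create()
    val analyzer = MLAnalyzerFactory.getInstance().getLocalTextAnalyzer(setting)

    analyzer.asyncAnalyseFrame(MLFrame.fromBitmap(bitmap))
        .addOnSuccessListener { text ->
            onResult(text.stringValue)
            analyzer.stop() // release detection resources
        }
        .addOnFailureListener {
            onResult("Exception occurred")
            analyzer.stop()
        }
}
```

Creating a new analyzer per call keeps the example short; an app that recognizes many images would more likely keep one analyzer and stop it when the ViewModel is cleared.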