Using OCR to recognize text content in Airtest

2022-07-28 17:34:00 [itest_2016]

Airtest is a UI automation framework developed by NetEase and based on image recognition. To operate on an element, you first take a screenshot of it; Airtest then matches that screenshot against the current screen of the client, finds its position, and performs the click.

This removes the complexity of traditional control lookup, but assertions on elements are imprecise. In particular, when you need to check not only whether an element exists but also what text it shows, image matching alone cannot help.

One way to solve this is the traditional element lookup (Airtest ships element-lookup APIs for each platform): find the element, then read its text attribute to get the content. This has two drawbacks. First, you have to bring back traditional locators such as XPath. Second, because of non-standard development practices and similar reasons, an element's locating attributes are often hard to pin down, and the time spent hunting for them throws away the convenience of image-based control recognition.

In practice (my project involves interaction between a PC client and an Android client), I found that Airtest's own API combined with a Python OCR library can recognize the content of an element and solve this problem.

The general idea: first use Airtest's API to capture a screenshot of the region to be recognized, then use the OCR library to read its content.

In theory, if your device resolution never changes, you don't need Airtest's API at all: you can crop at a fixed position with an image-processing library such as OpenCV and save the result. In production, however, the device resolution cannot be guaranteed, so a relative position is needed. Airtest has a method for exactly this, the same one it uses for its own resolution compatibility.

From the coordinates and the device resolution recorded when you took the screenshot, Airtest generates a record_pos. This value is what it uses to compute the coordinate offset when the resolution differs.
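
To make the idea of a resolution-independent offset concrete, here is a minimal sketch of how such a value can be computed and mapped back to pixels. The function names to_record_pos and to_screen_point are hypothetical, and the formula (both axes measured from the screen centre and divided by the screen width) reflects my reading of Airtest's behaviour rather than a confirmed specification, so check it against the Airtest source before relying on it.

```python
def to_record_pos(pos, resolution):
    """Express a pixel position as an offset from the screen centre.

    Assumption: both axes are divided by the screen WIDTH, so the
    offset scales uniformly when the resolution changes.
    """
    w, h = resolution
    dx = (pos[0] - w * 0.5) / w
    dy = (pos[1] - h * 0.5) / w
    return round(dx, 3), round(dy, 3)


def to_screen_point(record_pos, screen_resolution):
    """Map a recorded offset back to pixel coordinates on another screen."""
    dx, dy = record_pos
    w, h = screen_resolution
    return dx * w + w * 0.5, dy * w + h * 0.5
```

Under these assumptions, a point at (1440, 270) on a 1920x1080 screen is recorded as (0.25, -0.141), and the same offset on a 1280x720 screen maps back to x = 960, three quarters of the way across, just as on the original screen.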

I won't go into how record_pos is turned into coordinates here; what matters is how to locate an element's region from record_pos. The method is get_predict_area, which takes four parameters: record_pos (the offset recorded when the control image was captured), image_wh (the width and height of the control image), image_resolution (the device resolution recorded when the image was captured), and screen_resolution (the actual resolution of the device on which the control is being located).
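
The following standalone sketch shows what a computation with these four parameters might look like. It is not Airtest's code: predict_area is a hypothetical name, the arithmetic is my understanding of the approach, and the deviation argument stands in for the padding constant that the sample code in this article sets to 0.

```python
def predict_area(record_pos, image_wh, image_resolution, screen_resolution, deviation=0):
    """Estimate the rectangle (xmin, ymin, xmax, ymax) of a control on screen.

    Hypothetical re-implementation for illustration only.
    """
    dx, dy = record_pos
    sw, sh = screen_resolution
    # Centre of the control: offsets are fractions of the screen width,
    # measured from the screen centre (an assumption about record_pos).
    cx = dx * sw + sw * 0.5
    cy = dy * sw + sh * 0.5
    # Half-width and half-height of the control, rescaled from the
    # resolution at capture time to the current one, plus optional padding.
    rx = int(image_wh[0] * sw / (2 * image_resolution[0])) + deviation
    ry = int(image_wh[1] * sh / (2 * image_resolution[1])) + deviation
    return cx - rx, cy - ry, cx + rx, cy + ry
```

For a 100x50 control recorded at the centre of a 1000x500 screen, this gives (450.0, 225.0, 550.0, 275.0) on the same screen and (900.0, 450.0, 1100.0, 550.0) on a 2000x1000 screen, so the rectangle grows with the resolution.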

This method returns the coordinates of a rectangle. We crop that rectangle out of a device screenshot with the built-in image-processing methods and save it, which gives us an image of the control to be recognized. Running the OCR library on that image then yields the control's text.

The sample code is as follows (it only applies to my hardware and is meant as a reference):

        from PIL import Image
        import pytesseract
        from airtest import aircv
        from airtest.core.api import device
        from airtest.core.cv import Predictor


        def get_answer_num():
            """Return the number of answers shown in the answer area, or None if it cannot be read."""
            dev = device()
            stu_answer_num = r'./valid_pic/stu_answer_num.png'  # Path for the cropped answer-area image
            interact_pic_path = r'./valid_pic/snap_interact.png'  # Path for the interactive-interface screenshot
            interact_image = dev.snapshot(interact_pic_path)  # Take and save a screenshot of the interactive interface

            screen_resolution = aircv.get_resolution(interact_image)  # Actual resolution of the screenshot
            # Dynamically create a subclass of Predictor with the class attribute DEVIATION
            # set to 0, so that the predicted area is the exact control region with no padding
            predictor = type('Pos', (Predictor,), {'DEVIATION': 0})
            # Coordinates of the answer area on the current screen
            xmin, ymin, xmax, ymax = predictor.get_predict_area(record_pos=(0.404, -0.213), image_wh=(265, 45),
                                                                image_resolution=(1919, 1040),
                                                                screen_resolution=screen_resolution)
            predict_area = aircv.crop_image(interact_image, (xmin, ymin, xmax, ymax))
            aircv.imwrite(stu_answer_num, predict_area)  # Save the cropped area to be recognized
            # The recognized text looks like '全 班 共 参 与 : 1/1' (roughly "whole class participated: 1/1")
            answer_str = pytesseract.image_to_string(Image.open(stu_answer_num), 'chi_sim')
            print(answer_str)
            # str.find() returns -1 (truthy) on a miss, so test membership with `in` instead
            if '参 与' in answer_str:
                answer_str = answer_str.split()[-1]  # Take '1/1'
                return int(answer_str.split('/')[0])  # Number of answers
            return None
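
The parsing step in the sample code depends on the exact spacing that the OCR engine happens to produce. As a possible alternative, the sketch below extracts the count with a regular expression instead; parse_answer_count is a hypothetical helper, and it assumes only that the recognized text contains a fraction such as 1/1.

```python
import re


def parse_answer_count(ocr_text):
    """Return the number before the slash in text like '... : 1/1', or None."""
    # Allow optional whitespace around the slash, since OCR output
    # often inserts stray spaces between characters.
    match = re.search(r"(\d+)\s*/\s*(\d+)", ocr_text)
    if match is None:
        return None
    return int(match.group(1))
```

Calling parse_answer_count on the string returned by pytesseract.image_to_string would then replace the split-based logic, and a None result signals that no count was recognized.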
