Applications and development trends of image recognition technology

2022-07-05 01:36:00 Xiaobai learns vision

Background of image recognition technology

The development of the mobile Internet, smartphones, and social networks has brought a massive volume of image information. According to a BI article in May, about 60 million pictures are uploaded to Instagram every day; in February of this year, WhatsApp users were sending 500 million pictures per day; and WeChat Moments in China is likewise driven by image sharing. Pictures, which are not restricted by region or language, have gradually replaced cumbersome and subtle text as the main medium for conveying meaning. There are two main reasons why pictures have become the main medium of information exchange on the Internet:

First, in terms of how users read information: compared with text, pictures give users information that is more vivid, easier to understand, more interesting, and more artistic.

Second, in terms of where pictures come from: smartphones give us convenient ways to take photos and screenshots, which help us collect and record information faster with pictures.

But as pictures become the main information carrier on the Internet, a problem arises. When information is recorded as text, we can easily find the content we need through keyword search and edit it. When information is recorded as pictures, we cannot search the content inside the picture, which makes it harder to find the key content. Pictures give us a quick way to record and share information, but they reduce the efficiency of information retrieval. In this environment, computer image recognition technology becomes particularly important.

Image recognition is the technology by which computers process, analyze, and understand images in order to identify targets and objects of various patterns. The recognition process includes image preprocessing, image segmentation, feature extraction, and decision matching. Simply put, image recognition is about how computers can read the contents of pictures the way people do. With the help of image recognition technology, we can not only get information faster through image search, but also gain a new way to interact with the external world, and even make the external world run more intelligently. Robin Li of Baidu said in 2011 that "a new era of image reading has come." Now, with the continuous progress of image recognition technology, more and more technology companies are entering the field. This marks the official arrival of the era of image reading, and it will lead us into a more intelligent future.
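
The four steps named above (preprocessing, segmentation, feature extraction, and decision matching) can be illustrated with a toy sketch in Python. This is a minimal sketch for intuition only, not how any product mentioned in this article works: the function names, the two features (fill ratio and width/height ratio), and the reference values in TEMPLATES are all assumptions made up for the example.

```python
from math import dist

def preprocess(image):
    """Stretch pixel values to the range 0.0-1.0 (simple contrast normalization)."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero on a flat image
    return [[(p - lo) / span for p in row] for row in image]

def segment(image, threshold=0.5):
    """Mark pixels brighter than the threshold as foreground (1), the rest as background (0)."""
    return [[1 if p > threshold else 0 for p in row] for row in image]

def extract_features(mask):
    """Describe the foreground by its bounding-box fill ratio and width/height ratio."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return (0.0, 0.0)
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return (len(coords) / (height * width), width / height)

def match(features, templates):
    """Return the label whose reference features are nearest (Euclidean distance)."""
    return min(templates, key=lambda label: dist(features, templates[label]))

def recognize(image, templates):
    """Run the four stages in order."""
    return match(extract_features(segment(preprocess(image))), templates)

# Hypothetical reference features: (fill ratio, width / height)
TEMPLATES = {
    "horizontal bar": (1.0, 4.0),
    "vertical bar": (1.0, 0.25),
    "square": (1.0, 1.0),
}

image = [
    [10, 10, 10, 10, 10, 10],
    [10, 200, 210, 205, 198, 10],
    [10, 10, 10, 10, 10, 10],
]
print(recognize(image, TEMPLATES))  # prints "horizontal bar"
```

Real systems replace each stage with far more capable methods, but the flow of data from raw pixels to a matched label follows the same order.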

The primary stage of image recognition: entertainment and tools

At this stage, users mainly use image recognition technology to meet entertainment needs. For example, the "Big Coffee Match" feature of Baidu Magic Map helps users find the celebrities who look most like them, and Baidu's image search can find similar images; Facebook developed DeepFace, a method for matching faces across photos; IQ Engines, an image recognition company acquired by Yahoo, developed Glow, which automatically generates photo tags through image recognition to help users manage the photos on their phones; Kuangshi Technology (Megvii), a domestic startup focused on image recognition, set up the VisionHacker game studio to develop motion-sensing games for mobile devices using image recognition; and Chuangshi Technology uses image recognition to develop machine vision systems for surface inspection.

At this stage there is also a very important segment, OCR (Optical Character Recognition). OCR refers to the process in which optical equipment examines characters printed on paper, determines their shapes by detecting patterns of dark and light, and then translates those shapes into computer characters through character recognition. In other words, it is how computers read text. Language and text are our most basic and most important means of obtaining information. In the world of bits, we can easily access and process text with the help of the Internet and computers. But once text appears in the form of a picture, acquiring and processing it becomes much more troublesome. On the one hand, this applies to text in the digital world that, for particular reasons, has been stored in image format; on the other hand, it applies to the text in all kinds of physical forms that we see in real life. So we need OCR technology to extract this text and information. In this area, domestic products include Baidu's painting notes and Baidu Translate. Google, for its part, uses a large distributed neural network trained with DistBelief to achieve a recognition rate above 90% on the tens of millions of house numbers in the Google Street View library, and can recognize millions of house numbers per day.
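
The idea of determining a character's shape from dark and light patterns can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a description of any OCR product named above: the 3x3 glyph templates in GLYPHS, the threshold of 128, and the function names are all hypothetical, and template matching by Hamming distance stands in for the much richer recognition methods used in practice.

```python
def binarize(gray, threshold=128):
    """Turn a grayscale grid into dark (1) / bright (0) cells."""
    return tuple(tuple(1 if p < threshold else 0 for p in row) for row in gray)

# Hypothetical 3x3 reference shapes; 1 marks a dark (inked) cell.
GLYPHS = {
    "I": ((0, 1, 0), (0, 1, 0), (0, 1, 0)),
    "L": ((1, 0, 0), (1, 0, 0), (1, 1, 1)),
    "T": ((1, 1, 1), (0, 1, 0), (0, 1, 0)),
}

def hamming(a, b):
    """Count the cells in which two equally sized patterns differ."""
    return sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def read_char(gray):
    """Return the character whose template differs least from the scanned pattern."""
    pattern = binarize(gray)
    return min(GLYPHS, key=lambda ch: hamming(pattern, GLYPHS[ch]))

# A scanned cell grid: low values are dark ink, high values are bright paper.
scan = [
    [30, 240, 250],
    [25, 235, 245],
    [20, 40, 120],
]
print(read_char(scan))  # prints "L"
```

Because the closest template wins, a scan with a few mis-thresholded cells can still be read correctly, which is the basic reason matching tolerates some noise.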

At this stage, image recognition technology exists only as an auxiliary tool. It assists and enhances our own human vision, and gives us a new way to interact with the outside world. We can find the key information in a picture by searching; we can take a picture of an unfamiliar object and quickly find all kinds of information related to it; we can take a photo of someone we would like to approach and look up their social network profile in advance; face recognition can also serve as a primary means of identity authentication; and so on. These applications may look ordinary, but when image recognition technology permeates every aspect of our habits, we are in effect outsourcing part of our vision to machines, just as we have already outsourced part of our memory to search engines.

This will greatly improve the way we interact with the outside world. Previously, the process of using technology to explore the external world went like this: the human eye captures the target information, the brain analyzes it and converts it into keywords the machine can understand, and then we interact with the machine to get results. Once image recognition technology gives machines "eyes," the process can be simplified: the human eye captures the target information through the machine, and the machine and the Internet analyze the information directly and return results. Image recognition makes the camera the key to decoding information. We only need to point the camera at something unknown to get the answer we are looking for. As Baidu scientist Yu Kai said, the camera has become one of the important gateways connecting people to the world's information.

The advanced stage of image recognition: machines with vision

As mentioned above, current image recognition technology serves as a tool that helps us interact with the external world. It only assists our own vision, and all actions still have to be completed by us. When machines truly have vision, they may well be able to complete these actions in our place. Current image recognition applications are like a guide dog that leads a blind person as they move. In the future, image recognition will be integrated with other AI technologies to become a full-time housekeeper for the blind person, who no longer needs to do anything because the housekeeper takes care of everything. For instance, when image recognition is a tool, it is like wearing Google Glass while driving: it analyzes external information and passes it to us, and we then make driving decisions based on that information. When image recognition is applied to machine vision and artificial intelligence, it is like Google's driverless car: the machine not only acquires and analyzes external information, but also takes full responsibility for all driving activity, freeing us completely.

The book "Artificial Intelligence: A Modern Approach" notes that, in AI, perception provides machines with information about the world they inhabit by interpreting the responses of sensors. The forms of perception that machines share with humans include vision, hearing, and touch, and vision is the most important, because vision is the basis of all action. At a forum, Yu Kai, director of Baidu IDL, asked the audience which sense they thought was the most important. No one could answer quickly. He then rephrased the question: if you had to give up one sense, which would you least want to lose? This time, everyone answered vision. Chris Frith writes in "Making Up the Mind" that our perception of the world is not direct but depends on "unconscious inference": before we can perceive an object, the brain must infer what the object might be from the information reaching the senses. This underlies the most important human ability, to predict and to deal with emergencies. Vision is the most timely and accurate channel of information in this process, and 80% of human sensory information is visual. Machine vision matters to artificial intelligence as much as vision matters to humans, and what determines machine vision is image recognition technology.

What's more, in some application scenarios machine vision has advantages over human physiological vision: it is more accurate, more objective, and more stable. Human vision has natural limitations. We seem to perceive the world immediately and effortlessly, and we seem to perceive the whole visual scene in detail, but this is an illusion. Only the middle of the visual scene, which is projected onto the center of the eye, is seen clearly in detail and in vivid color. About 10 degrees away from the center, nerve cells are sparser and can only detect light and shadow. In other words, the edges of our visual world are colorless and blurry. As a result, we are subject to "change blindness": when many things happen at once, we focus on only one of them, ignore the others, and do not even know they occurred. Machines have the advantage here, because they can detect and record everything that happens within their field of view. Take video surveillance, the most widely used application. Traditional monitoring requires someone to stay highly alert in front of a wall of screens and draw conclusions from their own judgment of the video, but fatigue, visual limitations, and distraction often undermine the result. With mature image recognition technology, supported by artificial intelligence, the computer can analyze and judge the video by itself and raise an alarm directly when it finds an abnormal situation, which brings higher efficiency and accuracy. In counter-terrorism, machine-based face recognition is also far better than subjective human judgment.

Many technology giants have also begun to position themselves in image recognition and artificial intelligence. Yann LeCun, the AI expert hired by Facebook, made his most significant achievements in image recognition. The convolutional neural networks he proposed, represented by LeNet, have achieved good results on a variety of image recognition tasks and are regarded as one of the representative general-purpose image recognition systems. Google used "DistBelief" to simulate a neural network that, by learning from millions of YouTube videos, picked up the key features of cats on its own. That is, the machine came to understand the concept of a cat without anyone's help. It is worth mentioning that Andrew Ng, who was in charge of this project, has since moved to Baidu to lead the Baidu Research Institute, one of whose important research directions is artificial intelligence and image recognition. This also shows the importance that domestic technology companies attach to image recognition and artificial intelligence.

Image recognition technology connects machines to a world they do not yet understand, helps them understand it better, and will eventually let them complete more tasks in our place.

Copyright notice
This article was written by [Xiaobai learns vision]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/02/202202141026251553.html