Product identification for an intelligent retail cabinet based on PaddleX
2022-07-04 14:07:00 【Short section senior】
Abstract
In traditional retail cabinets, automatic product recognition is mainly achieved through hardware separation, weight sensing, customer-behavior recognition, or RFID tags. Building on the strong performance of deep learning in image classification, this article studies product identification for an intelligent retail cabinet based on PaddleX. Compared with manual barcode scanning, customer self-service scanning, and the automatic recognition methods above, this approach is more efficient. The experiment follows an existing project: the model is trained with PaddleX, using PPYOLO or YOLOv3 as the detection model and ResNet50 as the backbone network. Because training was cursory, the results are not yet very pronounced.
Keywords: image classification; PaddleX; convolutional neural networks; feature learning; image recognition
Introduction
With the rapid development of China's economy, disposable income has risen, and shoppers' attention has gradually shifted from price to the experience of the purchasing process. However, whether in large supermarkets or small convenience stores, checkout queues always form in densely populated areas and during peak hours (such as weekends), which undoubtedly degrades the shopping experience. Relieving these queues by adding cashiers and checkout lanes raises labor costs too much. Self-service barcode-scanning checkout avoids the extra labor cost, but in essence it only transfers the scanning work from the cashier to the customer. In the process, customers may run into problems such as failed barcode scans or being unable to complete payment, so checkout takes even longer. The technology therefore remains cumbersome and inefficient. Applying advanced technologies such as big data and artificial intelligence to the sales process, replacing traditional barcode-based checkout and building "new retail", has become an inevitable trend.
In recent years, automated retail stores and unmanned convenience stores, such as JD convenience stores and Alibaba's unmanned supermarket, have emerged at home and abroad. This shows that using artificial intelligence to transform the sales process and make retail scenarios automated and unmanned has become one of the hot topics in the AI field. Compared with new retail, traditional retail involves many manual steps and a low level of automation, which leads to high costs as well as low service efficiency and a less comfortable experience. Advances in computer vision have made product recognition technology increasingly mature, and image-based product recognition can raise the degree of automation, reduce costs, and increase efficiency. Designing and developing a computer-vision-based system for automatic identification and checkout of multiple products at once therefore has significant research and application value.
Object detection is an important branch of image processing and computer vision and has been widely applied in many fields. The intelligent retail cabinet, a typical representative of intelligent retail systems, can provide automated sales service without a salesperson. In traditional retail cabinets, automatic recognition is mainly achieved through hardware separation, weight sensing, customer-behavior recognition, or RFID tags. These traditional methods are expensive, reduce the cabinet's space utilization, and limit the types of products that can be sold. Product recognition technology is more efficient than manual barcode scanning, customer self-service scanning, and the automatic recognition methods above. It can increase store profits, and for customers it shortens the time spent waiting in line and makes purchasing more convenient.
Here, product detection means obtaining the position coordinates of each product in the image. In a product identification system, recognition builds on detection: when an image contains multiple products, their positions must be obtained before a recognition algorithm can determine their categories. If detection is accurate and the regions are precise, the image crops passed to the recognition algorithm contain far less background interference, which improves recognition accuracy. An object detection algorithm finds all objects of interest in an image: it determines each object's position, usually represented by a rectangular box, and classifies the object inside the box. This is one of the core problems in computer vision. Object detection algorithms can therefore be used for product detection, but products vary widely in appearance and shape, customers place them on the checkout counter in all sorts of ways, and factors such as lighting and occlusion also interfere. All of this makes product detection a very challenging problem.
The approach used in this project follows the original project: the model is trained with PaddleX, using PPYOLO or YOLOv3 as the detection model and ResNet50 as the backbone network. The dataset contains 5422 images covering 113 product categories, so this is a multi-class classification problem.
1. Introduction to PaddleX
PaddleX integrates the image classification, object detection, semantic segmentation, and instance segmentation capabilities of PaddlePaddle's intelligent vision stack. It connects the whole deep learning development workflow end to end, from data preparation through model training and optimization to multi-platform deployment, and provides a unified task API and a graphical development interface demo. Developers do not need to install separate packages for each task and can complete the full PaddlePaddle development workflow with little code.
PaddleX has been validated in real application scenarios across more than ten industries, including quality inspection, security, patrol inspection, remote sensing, retail, and healthcare. It distills this industry experience and provides a rich set of practical case tutorials to help developers through the whole process of putting models into production.
Overall, PaddleX has the following three advantages:
First, an end-to-end workflow: it connects the whole deep learning development process, from data ingestion through model training, parameter tuning, and model evaluation to prediction and deployment. This removes the glue code and script calls between stages and greatly improves development efficiency.
Second, an open-source technical core: it integrates the leading vision algorithms and task-oriented development kits of PaddleCV, the pretrained model tool PaddleHub, the visual analysis tool VisualDL, the model compression tool PaddleSlim, and other capabilities. It provides a concise, easy-to-understand Python API and is fully open source, which makes integration and secondary development easy.
Third, a deep fit with industry: it is compatible with Windows, Mac, and Linux, and supports NVIDIA GPUs for accelerated deep learning training. Development happens locally, which keeps data secure and closely matches the practical needs of industrial applications.
2. Introduction to YOLO
YOLO is short for "You Only Look Once". Although it is not the most accurate algorithm, it strikes a good balance between accuracy and speed. YOLOv3 builds on YOLOv1 and YOLOv2. It introduces few innovations of its own, but it keeps the YOLO family's speed advantage while improving detection accuracy, especially for small objects. YOLOv3 applies a single neural network to the whole image, divides the image into multiple regions, and predicts the bounding boxes and probabilities for each region.
YOLOv3 uses only convolutional layers, which makes it a fully convolutional network (FCN). The YOLOv3 paper proposes a new feature extraction network, Darknet-53. As the name suggests, it contains 53 convolutional layers, each followed by a batch normalization layer and a leaky ReLU layer. There are no pooling layers. Instead, convolutional layers with a stride of 2 downsample the feature maps, which effectively prevents the loss of low-level features caused by pooling.
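To make the stride-2 downsampling concrete, here is a minimal sketch of how the feature-map size shrinks as it passes through successive stride-2 convolutions. The 3x3 kernel with padding 1, the count of five downsampling convolutions, and the 416x416 input are assumptions chosen to match the commonly described Darknet-53 configuration; the article itself does not state them.

```python
from typing import List


def conv_out_size(size: int, kernel: int = 3, stride: int = 2, padding: int = 1) -> int:
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def downsample_chain(size: int, num_stride2_convs: int) -> List[int]:
    """Feature-map sizes after each successive stride-2 convolution."""
    sizes = [size]
    for _ in range(num_stride2_convs):
        sizes.append(conv_out_size(sizes[-1]))
    return sizes


# Assumed input resolution of 416 and five stride-2 convolutions.
sizes = downsample_chain(416, 5)
print(sizes)  # [416, 208, 104, 52, 26, 13]
```

Each stride-2 convolution halves the resolution, so five of them reduce the input by a factor of 32 without any pooling layer.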
3. Data preprocessing
The data was collected under a stable light source, so the images are well lit and of excellent quality. In real application scenarios, however, lighting is very complex, and a laboratory setup cannot fully simulate it, so the input images need to be preprocessed. In addition to normalizing the images, data augmentation such as image flipping is applied. Most importantly, the brightness and saturation of the images are adjusted to simulate complex external lighting and make the model more robust to illumination changes.
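The brightness, saturation, and flip augmentations described above can be sketched with the Python standard library alone. This is an illustration of the idea, not the transform pipeline the project actually used: the function names, the jitter range `max_delta`, and the 50% flip probability are all assumptions.

```python
import colorsys
import random
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of (R, G, B) pixels with values 0..255


def jitter_pixel(pixel: Pixel, brightness: float, saturation: float) -> Pixel:
    """Scale the HSV value (brightness) and saturation of one RGB pixel."""
    r, g, b = (c / 255.0 for c in pixel)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    s = min(1.0, max(0.0, s * saturation))
    v = min(1.0, max(0.0, v * brightness))
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    return (round(r * 255), round(g * 255), round(b * 255))


def augment(image: Image, rng: random.Random, max_delta: float = 0.5) -> Image:
    """Randomly jitter brightness and saturation, then flip horizontally half the time."""
    brightness = rng.uniform(1 - max_delta, 1 + max_delta)
    saturation = rng.uniform(1 - max_delta, 1 + max_delta)
    out = [[jitter_pixel(p, brightness, saturation) for p in row] for row in image]
    if rng.random() < 0.5:
        out = [row[::-1] for row in out]  # horizontal flip
    return out


# A tiny 1x2 "image" stands in for a real 960x720 photo.
tiny = [[(200, 100, 50), (10, 20, 30)]]
print(augment(tiny, random.Random(42)))
```

For a detection dataset, a horizontal flip must also mirror the x-coordinates of every annotated bounding box; detection-aware transform libraries handle that step, and this sketch omits it.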
The data is divided into a training dataset and a training product library. The training dataset contains image data and annotation information. The images show densely placed products, with a size of 960x720 in JPG format. The dataset uses the VOC format, which meets the dataset format requirements of most deep learning development kits, including the training requirements of PaddleX and PaddleDetection. The dataset contains 5422 images in total, all annotated, covering 113 product categories. It is split into a training set of 3796 images, a validation set of 1084 images, and a test set of 542 images. The accompanying figure shows some sample images.
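The split sizes above correspond to roughly 70% / 20% / 10% of the 5422 images. A minimal sketch of such a split follows; the ratios are inferred from the counts rather than stated in the article, and the file names and the fixed seed are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(
    items: Sequence[str],
    val_ratio: float = 0.2,
    test_ratio: float = 0.1,
    seed: int = 0,
) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle once with a fixed seed, then cut into train / val / test."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_ratio)
    n_test = round(len(shuffled) * test_ratio)
    n_train = len(shuffled) - n_val - n_test  # remainder goes to training
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Hypothetical file names standing in for the 5422 annotated images.
names = [f"{i:05d}.jpg" for i in range(5422)]
train, val, test = split_dataset(names)
print(len(train), len(val), len(test))  # 3796 1084 542
```

Fixing the seed keeps the split reproducible, so validation and test images never leak into training across runs.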
4. Model training
When training the model with PaddleX, we use the VOCDetection dataset interface to load the VOC-format data.
PaddleDetection (https://github.com/PaddlePaddle/PaddleDetection) has more than 30 built-in model algorithms and more than 250 pretrained models, covering object detection, instance segmentation, multi-object tracking, keypoint detection, and other directions. It includes high-accuracy and lightweight industrial-grade SOTA models for both server and mobile, championship solutions, and cutting-edge academic algorithms. It also provides configurable network module components, more than ten data augmentation strategies, loss functions, and other advanced optimization support, along with a variety of deployment options. It covers the whole workflow of data processing, model development, training, compression, and deployment, and offers rich cases and tutorials to speed up the industrial adoption of these algorithms. Its application scenarios span more than ten industries, including manufacturing, smart cities, security, transportation, retail, and healthcare.
The selected detection model is relatively large, but its prediction speed is faster than YOLOv3-DarkNet53, which makes it suitable for server-side deployment. Other models can of course be substituted. After several epochs of training, the final mAP can exceed 65%. The accompanying figure visualizes the training process.
The loss curve shows that the model's loss declines as the number of training iterations increases. Training gradually meets expectations, and the loss value levels off. The training strategy was then adjusted according to the observed results, after which the loss dropped again. The final training behavior is normal, and model training is essentially complete.
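The training setup described in this section could be sketched roughly as follows. It assumes the PaddleX 2.x Python API (`paddlex.transforms`, `pdx.datasets.VOCDetection`, `pdx.det.PPYOLO`); the article does not say which PaddleX version was used. The list-file names, the `ResNet50_vd_dcn` backbone variant, and every hyperparameter value are placeholders rather than the author's settings.

```python
def dataset_files(data_dir: str) -> dict:
    """Hypothetical file layout for a VOC-format dataset directory."""
    return {
        "train": f"{data_dir}/train_list.txt",
        "val": f"{data_dir}/val_list.txt",
        "labels": f"{data_dir}/labels.txt",
    }


def train_ppyolo(data_dir: str, save_dir: str = "output/ppyolo", num_epochs: int = 200) -> None:
    # Imported lazily so this file can be loaded without PaddleX installed.
    import paddlex as pdx
    from paddlex import transforms as T

    files = dataset_files(data_dir)
    norm = T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

    # Brightness/saturation distortion and flipping, as in the preprocessing section.
    train_transforms = T.Compose([
        T.RandomDistort(),
        T.RandomHorizontalFlip(),
        T.Resize(target_size=608),
        norm,
    ])
    eval_transforms = T.Compose([T.Resize(target_size=608), norm])

    train_dataset = pdx.datasets.VOCDetection(
        data_dir=data_dir, file_list=files["train"], label_list=files["labels"],
        transforms=train_transforms, shuffle=True)
    eval_dataset = pdx.datasets.VOCDetection(
        data_dir=data_dir, file_list=files["val"], label_list=files["labels"],
        transforms=eval_transforms, shuffle=False)

    # Assumed backbone variant; the article only says "ResNet50".
    model = pdx.det.PPYOLO(
        num_classes=len(train_dataset.labels), backbone="ResNet50_vd_dcn")

    # Placeholder hyperparameters, to be tuned for the actual dataset.
    model.train(
        num_epochs=num_epochs,
        train_dataset=train_dataset,
        train_batch_size=8,
        eval_dataset=eval_dataset,
        learning_rate=0.005 / 12,
        warmup_steps=500,
        warmup_start_lr=0.0,
        lr_decay_epochs=[85, 135],
        save_interval_epochs=5,
        save_dir=save_dir,
        use_vdl=True,  # write VisualDL logs for plotting the loss curve
    )
```

Calling `train_ppyolo("path/to/dataset")` would start training under these assumptions. Swapping `pdx.det.PPYOLO` for `pdx.det.YOLOv3` is the kind of model change the text mentions.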
5. Evaluation metrics
Commonly used evaluation metrics for object detection models include IoU, recall, precision, mAP, and accuracy.
(1) IoU: Intersection over Union (IoU) measures localization accuracy. As the name suggests, it is the ratio of the intersection to the union of two bounding boxes. In detection tasks, a predicted bounding box is considered correct if IoU ≥ 0.5, where 0.5 is an empirically chosen threshold. If stricter results are required, the threshold can be raised, for example to 0.6 or 0.7. The threshold must be greater than 0 and less than 1, because IoU = 1 only when the predicted box and the ground-truth box coincide exactly. The threshold is rarely set below 0.5.
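As a concrete illustration of the IoU definition, the following sketch computes it for two axis-aligned boxes. The `(x_min, y_min, x_max, y_max)` box layout is an assumption made for the example.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)  # zero if boxes do not overlap
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


pred = (0, 0, 10, 10)
gt = (5, 0, 15, 10)
print(iou(pred, gt))  # intersection 50, union 150, so about 0.333
```

Here the two boxes overlap by half of each box, yet the IoU is only about 0.33, so this prediction would be rejected at the 0.5 threshold.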
(2) Recall and precision: Recall represents how completely a class is retrieved. It is the fraction of examples with a positive label that are predicted correctly: recall = TP / (TP + FN).
Precision represents how accurate the classifier's positive predictions are. It is the fraction of examples predicted as positive that are actually correct: precision = TP / (TP + FP).
Here, a true positive (TP) is a positive sample predicted as positive, a false negative (FN) is a positive sample predicted as negative, a false positive (FP) is a negative sample predicted as positive, and a true negative (TN) is a negative sample predicted as negative. In the detection task, with the IoU threshold set to 0.5, TP is the number of detection boxes with IoU greater than 0.5, FP is the number of detection boxes with IoU less than or equal to 0.5, and FN is the number of ground-truth boxes for which no predicted box exists.
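The TP / FP / FN counting described above can be sketched for a single image as follows. The greedy highest-score-first matching and the one-prediction-per-ground-truth rule are assumptions (a common convention), and the boxes in the example are made up.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def count_matches(preds: List[Tuple[float, Box]], gts: List[Box], thr: float = 0.5):
    """preds are (score, box) pairs. Returns (tp, fp, fn) for one image."""
    matched = set()
    tp = fp = 0
    # Visit predictions from most to least confident.
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            if j not in matched and iou(box, gt) > best_iou:
                best_iou, best_j = iou(box, gt), j
        if best_iou > thr:  # strictly greater, as in the text
            matched.add(best_j)
            tp += 1
        else:
            fp += 1
    fn = len(gts) - len(matched)  # ground-truth boxes nobody matched
    return tp, fp, fn


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(0.9, (1, 1, 10, 10)), (0.8, (50, 50, 60, 60))]
tp, fp, fn = count_matches(preds, gts)
print(tp, fp, fn, precision_recall(tp, fp, fn))  # 1 1 1 (0.5, 0.5)
```

The first prediction overlaps a ground-truth box with IoU 0.81 and counts as a TP, the second hits nothing and counts as an FP, and the unmatched second ground-truth box counts as an FN.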
(3) mAP: the mean of the per-class AP values (mean average precision, mAP). To compute mAP, first plot the precision-recall (PR) curve for each class and compute its AP. AP is the area under the PR curve, that is, the mean of the precision values over all recall values between 0 and 1. Once the AP of every class is available, average them to obtain mAP.
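A minimal sketch of the AP and mAP computation follows. It uses the all-point interpolation convention (precision made monotonically non-increasing before integrating), which is one common choice; the article does not say which variant was used. The class names and match flags are invented for the example.

```python
from typing import Dict, Sequence


def average_precision(is_tp: Sequence[bool], num_gt: int) -> float:
    """Area under the PR curve for one class.

    is_tp holds one flag per detection, sorted by descending confidence.
    num_gt is the number of ground-truth boxes of this class.
    """
    if num_gt == 0:
        return 0.0
    recalls, precisions = [], []
    tp = 0
    for k, hit in enumerate(is_tp, start=1):
        tp += int(hit)
        recalls.append(tp / num_gt)
        precisions.append(tp / k)
    # Pad the curve, then take the running maximum of precision from the right.
    mrec = [0.0] + recalls + [1.0]
    mpre = [0.0] + precisions + [0.0]
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    # Sum rectangle areas wherever recall increases.
    return sum((mrec[i] - mrec[i - 1]) * mpre[i] for i in range(1, len(mrec)))


def mean_average_precision(per_class_ap: Dict[str, float]) -> float:
    return sum(per_class_ap.values()) / len(per_class_ap)


# Hypothetical classes with made-up detection outcomes.
aps = {
    "cola": average_precision([True, False, True], num_gt=2),
    "chips": average_precision([True, True], num_gt=2),
}
print(aps, mean_average_precision(aps))
```

In this example the "cola" class has one false positive ranked between two hits, giving AP = 5/6, while "chips" is detected perfectly with AP = 1, so mAP = 11/12.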
(4) Accuracy: each image in the test set (all) is fed into the network, and a forward pass yields the predictions (bounding boxes and categories). If the prediction for an image is consistent with its annotation file, including the number and positions of the detection boxes, the count of correct images (true) is incremented by one. Accuracy is then the ratio true / all.
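The image-level accuracy described in (4) could be sketched as below. The article does not define what counts as a correct position, so this example assumes a prediction matches an annotation when the category is the same and IoU ≥ 0.5. The greedy first-match pairing is a simplification, and the sample data is invented.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Detection = Tuple[str, Box]              # (category, box)


def iou(a: Box, b: Box) -> float:
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def image_is_correct(preds: List[Detection], gts: List[Detection], thr: float = 0.5) -> bool:
    """True if box count, categories, and positions all agree with the annotation."""
    if len(preds) != len(gts):
        return False
    unmatched = list(gts)
    for label, box in preds:
        hit = next(
            (g for g in unmatched if g[0] == label and iou(box, g[1]) >= thr), None)
        if hit is None:
            return False
        unmatched.remove(hit)  # each annotation may be matched only once
    return True


def accuracy(all_preds: List[List[Detection]], all_gts: List[List[Detection]]) -> float:
    true = sum(image_is_correct(p, g) for p, g in zip(all_preds, all_gts))
    return true / len(all_gts) if all_gts else 0.0


# Two hypothetical test images: the first is fully correct, the second misses a box.
all_gts = [
    [("cola", (0, 0, 10, 10))],
    [("cola", (0, 0, 10, 10)), ("chips", (20, 20, 30, 30))],
]
all_preds = [
    [("cola", (1, 1, 10, 10))],
    [("cola", (0, 0, 10, 10))],
]
print(accuracy(all_preds, all_gts))  # 0.5
```

This metric is much stricter than mAP, since a single missed or extra box makes the whole image count as wrong.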
6. Test results
The model is tested on a self-built test set. The experimental platform is a server with a GPU, and the programming environment is MATLAB 2019b. The computing device is a personal computer with an Intel Core i7-9750H CPU clocked at 2.60 GHz, 8 GB of video memory, and 16 GB of RAM. Only one GPU is used in testing. The detection module only needs to locate the products, so all test results are independent of product category. During testing the IoU threshold is set to 0.5. The test results, along with some result images, are shown in the accompanying figures.
The test results show that the detection model is quite robust: lighting and similar factors (darker or brighter images) have little effect on the detections. The model performs best on single-object detection, with accuracy close to 80%. On multi-object detection it is slightly worse but still achieves satisfactory results. The cases where the model performs poorly are summarized here: (1) when many objects are on the table and placed close together, detection boxes are easily wrong (false detections); (2) products that are too small or occluded in the image are easily missed (missed detections).
In one example, severe occlusion caused the model to miss a product. Even the human eye may overlook this occluded product at first.
7. Summary and outlook
This article studies a product identification system for intelligent retail. A good deal of literature was consulted, and the research background and significance of product identification were summarized. A review of the current state of product identification technology at home and abroad shows that many schemes for intelligent product detection already exist. Each has advantages and disadvantages, but the schemes are mature. This experiment merely reproduces an earlier project. Because training time was short, the results are still not ideal, and we will make improvements in the future.
Compared with the experimental environment, real-world checkout environments are more complex and changeable, so many aspects still need improvement and optimization:
(1) Increase the number of cameras at the checkout counter to capture products from more viewpoints. The product identification system designed here uses two cameras, so products easily occlude one another in the captured images, causing recognition errors. Capturing images from more angles can address this and further improve recognition accuracy.
(2) Train a better model with fewer images. This article uses only a small number of product images for training and testing, covering only some product categories. Real-world product types far exceed this, and the number of dataset images required would be enormous. It is therefore important to find methods that achieve good model performance from a small number of training images.
References
[1] Product identification of unmanned cabinet based on PaddleX (demo). https://aistudio.baidu.com/aistudio/projectdetail/3474742?channelType=0&channel=0&qq-pf-to=pcqq.group
[2] Intelligent retail cabinet commodity identification. https://aistudio.baidu.com/aistudio/projectdetail/2250826?channelType=0&channel=0
[3] Chi Haitao. Research on the business structure in the new retail era from the perspective of artificial intelligence [J]. Business economics research, 2019(09): 51-53.
[4] Chen Jingwen. Design and implementation of automatic commodity identification system for intelligent retail [J]. Hangzhou University of Electronic Science and Technology, 2021(02).
[5] With jade Teng. Research on commodity recognition based on deep learning [D]. Qingdao University of Science and Technology, 2019.
[6] Hu Zhengwei. Research on supermarket commodity image recognition method based on deep learning [D]. University of Science and Technology of China, 2018.