An In-Depth Look at Building an Intelligent Recommendation System
2020-11-06 01:15:00 【InfoQ】
{"type":"doc","content":[{"type":"heading","attrs":{"align":null,"level":2}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/af/af9f6637b50b09be60b00a42f3812d5e.png","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Yunmei's reading guide:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the “Exhibition Cloud Technology Interpretation” series, we have already published articles on "},{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/s?__biz=MzU1OTgxMTg2Nw==&mid=2247494756&idx=1&sn=3628afb8c6b0053d7e62b2ac94e643c3&scene=21#wechat_redirect","title":null},"content":[{"type":"text","text":"security"}]},{"type":"text","text":" and "},{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/s?__biz=MzU1OTgxMTg2Nw==&mid=2247494818&idx=1&sn=4b294480df8370df767388ecaa9988ff&scene=21#wechat_redirect","title":null},"content":[{"type":"text","text":"design"}]},{"type":"text","text":", which covered how to meet the cloud exhibition's most stringent security requirements and how service design methods were applied to the online exhibition. In this article, Yunmei continues with another key part of the exhibition cloud: the intelligent recommendation system."}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Intelligent recommendation is ubiquitous in today's Internet products. It builds a static user profile from dimensions such as each user's gender, age, and hobbies, combines it with a dynamic user profile formed from behavioral data such as clicks, likes, comments, and favorites, and uses the two together to uncover the user's deeper interests and needs."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The news recommendation and e-commerce product recommendation we usually encounter differ from recommendation in an exhibition setting, which has to meet the needs of "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"exhibitors"},{"type":"text","text":", "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"buyers"},{"type":"text","text":", and "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"individual users"},{"type":"text","text":" alike. This is especially true of the recently held never-ending Cloud Service Trade Fair, which for the first time combined online and offline formats and extended the fair's reach from one concentrated week to the whole year, so that exhibitors, buyers, and individual users looking for business opportunities can browse the Cloud Service Trade Fair anytime and anywhere to find valuable ones."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Nearly ten thousand exhibitors registered, bringing a large number of exhibits that span "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"more than 200"},{"type":"text","text":" sub-industries. How can online users quickly find the business opportunities they want in this mass of exhibitor information? How can they keep acquiring useful opportunities over time? 
Solving these problems is key to improving the exhibition experience and efficiency. In this effort, the Jingdong Zhilian Cloud machine learning team took on "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"the development of the Cloud Service Trade Fair's intelligent recommendation feature"},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/5d/5d2bcd5d36fc7d1ea760b802d07f50c0.webp","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"As the figure above shows, the intelligent recommendation system consists of four modules and serves both the official website's 2D shops and the mobile app, delivering user-level personalized recommendations. 
For the fair's four key types of information, "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"exhibitors"},{"type":"text","text":", "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"booths"},{"type":"text","text":", "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"exhibits"},{"type":"text","text":", and "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"projects"},{"type":"text","text":", the intelligent recommendation system has four corresponding modules: "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"exhibitor recommendation"},{"type":"text","text":", "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"booth recommendation"},{"type":"text","text":", "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"exhibit recommendation"},{"type":"text","text":", and "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"project release recommendation"},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Of these, the exhibitor, booth, and exhibit recommendation modules match precisely on the user profiles, interest tags, and behavioral data of buyers and individual users. 
Project release recommendation is harder to implement: besides user profiles, interest tags, and similar data, projects are time-sensitive and strongly goal-driven, so heavily weighted content features also have to be brought into the recommendation."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Beyond making project release recommendation more accurate, implementing the intelligent recommendation feature raised three major problems:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","text":"Across the whole Cloud Service Trade Fair, the content shown by intelligent recommendation is the “first sight” for nearly 80% of users, so delivering the most accurate recommendation at that first moment is a hard problem. Moreover, this was the first Cloud Service Trade Fair, so there was no historical data to draw on. 
How to extract the most value from a user's first visit is a question the whole project kept coming back to;"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","text":"Although the Cloud Service Trade Fair has far fewer registered exhibitors than an e-commerce platform, it faced the same high-concurrency and performance challenges during the offline exhibition from September 5 to September 9; sound architecture and system design are the solid line of defense that lets it withstand that test;"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","text":"Once the “first sight” recommendation is solved, how should the second, third, and later recommendations be made? Besides building good user profiles, we keep exploring how to profile the exhibition content."}]}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"At the same time, we are also asking: how do we keep making follow-up recommendations for the “never-ending Service Trade Fair”? 
Unlike news recommendation and e-commerce product recommendation in Internet products, how can recommendation in an exhibition setting meet the needs of all parties (exhibitors, buyers, and individual users)?"}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/e6/e68a1cf97d4d5e2fc5161ee727f337a0.webp","alt":null,"title":"","style":[{"key":"width","value":"50%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Consumer-facing products on the market today, such as Toutiao, Taobao, and various music apps, each have their own tricks for handling cold start in recommendation, but the common idea is to collect as much data as possible through multiple channels. For example, on first use they prompt the user to “link a Weibo / WeChat / QQ account”; they ask users about their preferences and interests; or they collect basic user information (gender, age, region, and industry). 
Whether through passive information gathering or by having users actively state their intent, the goal is to build an understanding of the user; combined with fine-grained tag features extracted from the content being distributed, this makes personalized cold-start recommendation possible."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Another high bar for cold-start recommendation is "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"a deep understanding of user scenarios and behavioral motivations, backed by a sufficiently rich accumulated knowledge base."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For the Cloud Service Trade Fair's recommendation scenario, however, neither path looked easy. This was our first time working on recommendation for an exhibition, and although the recommended items, booths and exhibits, look similar to Jingdong's products, the user population and their reasons for visiting the exhibition differ greatly. Fortunately, Jingdong Zhilian Cloud has long been empowering ToB businesses and has built up an understanding of B-side enterprise procurement scenarios."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In particular, during the outbreak in early 2020, to help enterprises and governments procure epidemic-prevention supplies efficiently, Jingdong Zhilian Cloud launched the "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"“Emergency Resource Information Release Platform”"},{"type":"text","text":". It provides channels for "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"purchasing"},{"type":"text","text":" and "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"releasing"},{"type":"text","text":" supply and demand information, and it also matches platform users on "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"supply and demand needs"},{"type":"text","text":", "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"location"},{"type":"text","text":", "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"product fit and quantity"},{"type":"text","text":", "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"production capacity"},{"type":"text","text":", and "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"transport efficiency"},{"type":"text","text":", among other factors. The knowledge of supply-and-demand scenarios accumulated there could be applied to recommendation for this Cloud Service Trade Fair. 
On the other hand, we also did a great deal of groundwork on understanding and completing user profiles and content profiles, which ultimately ensured that the intelligent recommendation feature of the Cloud Service Trade Fair launched successfully."}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/d4/d41f7b64bae5018751316cd3f909bee7.webp","alt":null,"title":"","style":[{"key":"width","value":"50%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To meet rapidly growing high-performance requirements, we designed a multi-level cache architecture based on Caffeine and Redis, described below in two parts:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"1. "},{"type":"text","marks":[{"type":"strong"}],"text":"Technology selection"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Why Caffeine + Redis? Redis needs no introduction; everyone knows it well. Here we focus on Caffeine. Caffeine is "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"a high-performance caching library built on Java 8 that provides a near-optimal hit rate."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Some readers will ask: why not Guava Cache? 
Isn't that more familiar local cache, implemented with the LRU (Least Recently Used) algorithm, good enough?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Guava Cache was widely used in the past and performs well, but things move fast, and better-performing caching frameworks keep emerging. Caffeine is one of them. It is also worth adding that, starting with Spring 5 (Spring Boot 2), Caffeine replaces Guava Cache."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Why does Caffeine perform better?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Start with the eviction algorithm. Guava Cache uses LRU. LRU is relatively simple to implement and gives a good hit rate in everyday use: it "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"protects hot data effectively, "},{"type":"text","text":"but sporadic or periodic accesses can cause incidental data to be retained while genuinely hot data is evicted, greatly reducing the cache hit rate. 
That is why Caffeine uses the Window TinyLFU algorithm."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Before discussing Window TinyLFU, we need to introduce LFU. The LFU algorithm solves "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"LRU's problem of evicting genuinely hot data under bursty or periodic access, "},{"type":"text","text":"but data that is accessed at high frequency for a short period then stays in memory for a long time, so when eviction is triggered, newly added hot data is evicted by mistake and the hit rate eventually drops. LFU also has to maintain access frequencies and update them on every access, which is a large resource overhead. Window TinyLFU combines the strengths of LRU and LFU while avoiding the weaknesses of each."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Specifically, Window TinyLFU keeps a record of recent access frequencies and uses it as a filter: when a new record arrives, it is admitted to the cache only if it meets TinyLFU's requirement. To keep resource consumption low, the frequency record is implemented with a "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"4-bit CountMinSketch"},{"type":"text","text":", an algorithm similar to a Bloom filter that stores a large amount of access-frequency data in very little space. This design gives each item a chance to build up popularity instead of being filtered out immediately. 
This avoids sustained misses, and in a sudden traffic surge in particular, short bursts of repeated requests are not retained for long. To keep the history fresh, a time-decay process runs periodically or incrementally and halves all the counters."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/85/85c7ed3c61cb1695ca7e2d2f84e63182.webp","alt":null,"title":"","style":[{"key":"width","value":"50%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For long-lived data, W-TinyLFU uses a Segmented LRU (SLRU) policy. An item initially starts in the probationary segment and is moved to the protected segment on a subsequent access. When the protected segment is full, some items are demoted back to the probationary segment, which may in turn trigger eviction from the probationary segment. 
This mechanism keeps hot data that is accessed at short intervals and reclaims cold data that is rarely accessed again."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/ea/ea001867d057a38668c8f16aed208937.webp","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In addition, reads and writes in Caffeine are handled asynchronously by submitting events to queues. The queues use a RingBuffer data structure (the high-performance lock-free queue Disruptor also uses a RingBuffer). All writes share a single RingBuffer; for reads, the design is similar to Striped64, with each reading thread mapped to its own RingBuffer to avoid contention."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"Here is the official performance benchmark comparison:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"1. Read (100%)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/5f/5fdafb77fb00fcd965a7ffa061e249ea.webp","alt":null,"title":"","style":[{"key":"width","value":"50%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"2. Read (75%) / Write (25%)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/39/39ddbf5d982a2a3402c73c74bf003376.webp","alt":null,"title":"","style":[{"key":"width","value":"50%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":"3. Write (100%)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/21/218c5bfad8594faf77b2fbd115b510ea.webp","alt":null,"title":"","style":[{"key":"width","value":"50%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"2. "},{"type":"text","marks":[{"type":"strong"}],"text":"Multi-level cache design"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Redis, as a 
common cache, performs very well, but as data volume grows, data structures become more complex, and high-concurrency scenarios are layered on top, both network I/O cost and the single-node Redis bottleneck significantly affect the performance of the whole call chain. So we need Caffeine as a JVM-level cache and Redis as our second-level cache; this multi-level cache design is what finally meets our needs."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the Java world, application caching is most often implemented with Spring Cache, but Spring Cache supports only a single cache source and cannot cover a multi-level cache scenario. So we need to "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"define our own multi-level CacheManager by implementing the CacheManager interface, and implement our own Cache class (extending AbstractValueAdaptingCache),"},{"type":"text","text":" into which CaffeineCache, RedisTemplate, and the related policy configuration are injected. That lets us implement the get and put methods we want: multi-level cache reads and writes."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For data consistency, the design relies mainly on "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"Redis's publish/subscribe model,"},{"type":"text","text":" that is, every update and delete is broadcast through it to tell the other nodes to clear their local caches. Of course, because of CAP, this design cannot guarantee strong consistency, so we can only ensure eventual consistency as far as possible."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/04/04cd79fb5c9bf6b2057726adfa362c4a.webp","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/25/25c312be60cd2072277ceea952405a0f.webp","alt":null,"title":"","style":[{"key":"width","value":"50%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the exhibition cloud, we use "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"user profiles"},{"type":"text","text":", "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"information profiles"},{"type":"text","text":", "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"keyword matching"},{"type":"text","text":", and other techniques to deliver personalized recommendations. User profiles are built from data such as users' registration information, interest tags, and browsing preferences. 
Information profiles consist of four parts: exhibitor profiles, booth profiles, exhibit profiles, and project profiles. The first three are built jointly and draw on one another's information. For example, favorites, views, and other data for an exhibitor are enriched with the data for that company's booths and exhibits, while an exhibit's industry information is taken from the exhibitor profile. Modeling the three parts together produces a richer profile."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Keyword matching is mainly used for "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"matching keywords for industry names and transaction types; it allows non-standard information to be normalized."},{"type":"text","text":" The system is also optimized for cold start: when user and information data are scarce, it matches the little user registration information available against exhibitors' industry information and also factors in item popularity when ranking."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Service Trade Fair delivered personalized recommendations to hundreds of thousands of users, and the cold-start scheme lets it quickly provide intelligent recommendations for newly registered users and newly released information as well. 
The recommendation system uses the common "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"recall"},{"type":"text","text":" and "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"ranking architecture"},{"type":"text","text":". The recall stage uses models such as "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"collaborative filtering"},{"type":"text","text":" and "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"matrix factorization"},{"type":"text","text":" to quickly select a candidate set from massive data; the ranking stage uses more complex and more accurate "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"deep learning"},{"type":"text","text":" models, such as the widely used Wide&Deep and DeepFM, to rank every item in the candidate set precisely and provide accurate, stable service to the fair's users and exhibitors."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/fd/fdf02deb34f8eec1a9b96ff2aa737272.webp","alt":null,"title":"","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For the model, we chose DIN (Deep Interest Network). 
Before introducing the model itself, we first introduce the Attention mechanism."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The Attention mechanism is "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"a problem-solving approach modeled on human attention."},{"type":"text","text":" Put simply, it quickly picks out high-value information from a large volume of information; it is a mechanism that aligns internal experience with external sensation to increase the precision with which certain regions are observed."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For example, when human vision processes an image, it quickly scans the whole image to find the target region that deserves focus, the focus of attention, and then devotes more attention to that region to obtain more detail about the target while suppressing other, irrelevant information. 
Figure 1 illustrates the attention mechanism; the bright white areas are the regions that receive more attention."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/56/56aeb3c330e2b2954c046ff0d507dd29.webp","alt":null,"title":"▲ Figure 1: Diagram of the attention mechanism ▲","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The detailed computation of the attention mechanism is shown in Figure 2. Abstracting over most current attention methods, it can be summarized as two processes and three stages:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" The first process computes the weight coefficients from the query and the key:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(1) The first stage computes the similarity or correlation between the query and the key;"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"(2) The second stage normalizes the raw scores from the first stage."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" The second process, which is the third stage, uses the weight coefficients to compute a weighted sum of the values
:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/75/75332b7e3badc0ffc54f307a30cc85f2.webp","alt":null,"title":"▲ Figure 2: The three-stage attention computation ▲","style":[{"key":"width","value":"100%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" DIN uses the correlation between the candidate item and the user's historical behaviors to compute a weight, and this weight represents the strength of the “attention”. DIN designs a local activation unit that computes the correlation weight between the candidate item and each of the user's N most recent historical-behavior items, then uses these weights as coefficients to sum-pool the embedding vectors of those N items."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" User interest is represented by the weighted embedding. The weights are determined by the candidate item and the historical behaviors: the same candidate item affects the historical behaviors of different users differently, and historical behaviors that are highly relevant to the candidate item receive higher weights. 
As you can see, the activation unit is a multi-layer network whose inputs are the user-profile embedding vector, the information-profile embedding vector, and the cross product of the two."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The DIN model can be roughly divided into the following five parts:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"Embedding Layer:"},{"type":"text","text":" The raw data is a high-dimensional, sparse 0-1 matrix; the embedding layer compresses it into a low-dimensional matrix;"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"Pooling Layer:"},{"type":"text","text":" Different users have different amounts of behavior data, so their embedding matrices differ in size, while a fully connected layer can only handle fixed-dimension input; the pooling layer is therefore used to obtain a fixed-length vector;"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"Concat Layer:"},{"type":"text","text":" After the embedding layer and the pooling layer, the original sparse features have been transformed into several fixed-length abstract representation vectors of user interest; the concat layer then aggregates these vectors and outputs the single 
abstract representation vector of the user's interest;"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"MLP:"},{"type":"text","text":" The abstract representation vector output by the concat layer is fed into the MLP, which automatically learns cross features in the data;"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"Loss:"},{"type":"text","text":" The loss function is usually log loss."}]}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"DIN treats a user's interest not as a single point but as a multimodal function. Each peak represents one interest, and the height of the peak indicates the intensity of that interest. 
The intensity of a user's interest therefore differs across candidate items; that is, it keeps changing as the candidate item changes."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" In summary, DIN, by "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":" introducing the attention mechanism, "},{"type":"text","text":" builds a different abstract user representation for each candidate item, so that with a fixed data dimension it captures the user's current interest more accurately."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" The above is our technical support for, and thinking on, the intelligent recommendation module of the service trade fair. For this first "},{"type":"text","marks":[{"type":"italic"},{"type":"strong"}],"text":"“never-ending”"},{"type":"text","text":" service trade fair, we dare not slacken our pace on the road of technology: we keep thinking and keep exploring, stay true to our original aspiration, and look forward to the future."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":" Recommended reading 
:"}]},{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/s?__biz=MzU1OTgxMTg2Nw==&mid=2247494818&idx=1&sn=4b294480df8370df767388ecaa9988ff&scene=21#wechat_redirect","title":""},"content":[{"type":"text","text":" Online exhibition based on service design | Exhibition cloud technology interpretation "}]}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/s?__biz=MzU1OTgxMTg2Nw==&mid=2247494756&idx=1&sn=3628afb8c6b0053d7e62b2ac94e643c3&scene=21#wechat_redirect","title":""},"content":[{"type":"text","text":" Multiple layers of security safeguard the exhibition on the cloud | Exhibition cloud technology interpretation "}]}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"link","attrs":{"href":"https://mp.weixin.qq.com/s?__biz=MzU1OTgxMTg2Nw==&mid=2247494488&idx=1&sn=a100e4e684c83fe643c6b192eabd7134&scene=21#wechat_redirect","title":""},"content":[{"type":"text","text":" Nowhere for black-market operations to hide: Jingdong Zhilian cloud launches a risk identification service "}]}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":" Click "},{"type":"text","text":"【"},{"type":"link","attrs":{"href":"https://www.jdcloud.com/cn/cloudexpo/all?utm_source=PMM_infoQ&utm_medium=NAutm_campaign=ReadMoreutm_term=NA","title":""},"content":[{"type":"text","text":" Jingdong Zhilian cloud 
"}]},{"type":"text","text":"】"},{"type":"text","marks":[{"type":"strong"}],"text":", Learn about Jingdong mice cloud services "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":" More wonderful technology practice and exclusive dry goods analysis "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":"center","origin":null},"content":[{"type":"text","marks":[{"type":"strong"}],"text":" Welcome to your attention 【 Jingdong Zhilian cloud Developer 】 official account "}]},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/77/77b9f9bae21f5a6033857fbf27a4b901.jpeg?x-oss-process=image/resize,p_80/auto-orient,1","alt":null,"title":"","style":[{"key":"width","value":"50%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}}]}
Copyright notice
This article was created by [InfoQ]. Please include a link to the original when reposting. Thank you.
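The two-process, three-stage attention computation described in the article can be sketched in plain Python as follows. This is only a minimal illustration, not the authors' implementation: the dot-product similarity, the softmax normalization, the function name `attention_pool`, and the toy vectors are all assumptions made for the example. DIN itself computes the weights with a small multi-layer activation unit rather than a dot product.

```python
import math


def attention_pool(query, keys, values):
    """Return (weights, pooled) for one query against a list of keys/values."""
    # Stage 1: raw similarity between the query and each key
    # (dot product is an assumption; the article does not fix the function).
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]

    # Stage 2: normalize the raw scores into weights that sum to 1
    # (softmax is an assumption; subtracting the max keeps exp() stable).
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Stage 3: weighted sum of the value vectors, giving one fixed-length
    # vector no matter how many historical behaviors there are.
    dim = len(values[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, pooled


# Hypothetical toy data: one candidate item and three behavior embeddings.
candidate = [1.0, 0.0]
history = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights, user_interest = attention_pool(candidate, history, history)
print(weights, user_interest)
```

With these toy vectors, the history entry most similar to the candidate receives the largest weight, which mirrors the article's point that behaviors highly relevant to the candidate contribute more to the pooled user-interest vector.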