
Understand the recommendation system in one article: Outline 02: The recommender system pipeline, from recall to coarse ranking, to fine ranking, to re-ranking, and finally display to the user

2022-08-03 16:35:00 Ice Dew Coke


Note: this is a recent course for studying recommender systems systematically. We take the Xiaohongshu scenario as the example and explain recommender systems as they are built in industry.
I only talk about technologies that are actually useful in industry. To be honest, technology in industry is far ahead of academia, and the books and papers you can find through public channels have a large gap from industry practice.
You can't learn the key technologies of recommendation systems by reading books.

Lectures by Shusen Wang: **《Xiaohongshu's recommendation system》**
GitHub materials: http://wangshusen.github.io/
Bilibili video collection: https://space.bilibili.com/1369507485/channel/seriesdetail?sid=2249610




The recommender system pipeline

We continue with the basic concepts of recommender systems.

This section covers the recommender system pipeline.

The pipeline is divided into four stages: recall, coarse ranking, fine ranking, and re-ranking.

This lesson gives only a brief introduction; the following lessons explain each part in detail.

The first step is recall: quickly retrieve some items from the item database.
For example, Xiaohongshu has hundreds of millions of notes. When a user refreshes Xiaohongshu,
the system calls dozens of recall channels at the same time. Each channel retrieves dozens to hundreds of notes, for a total of several thousand.

After recall is finished, the next step is to select, from those thousands of notes, the ones the user is most interested in.

The next step is coarse ranking: a smaller machine learning model scores the thousands of notes one by one.
The notes are then sorted and truncated by score, keeping the few hundred with the highest scores.

After that comes fine ranking. Here a large-scale deep neural network scores the few hundred notes one by one.
The fine-ranking score reflects the user's interest in a note. After fine ranking you can truncate again, or you can skip truncation.

At Xiaohongshu, fine ranking does not truncate: all of these few hundred notes enter re-ranking carrying their fine-ranking scores.

Re-ranking is the last step. Here notes are randomly sampled according to the fine-ranking score and a diversity score, yielding a few dozen notes.
Similar content is then spread apart, ads and operations content are inserted, and the result is shown to the user.

That is the general idea of the recommendation system; the stages are explained one by one below.
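
The four stages described above can be sketched as a small funnel. This is only an illustration under assumptions: `run_pipeline`, `take_top`, the toy channels, and the toy scoring functions are all hypothetical names invented here, and the sizes are tiny stand-ins for the real "thousands, hundreds, dozens".

```python
from typing import Callable, Iterable, List, Tuple


def run_pipeline(
    channels: Iterable[Callable[[], List[int]]],
    coarse_score: Callable[[int], float],
    fine_score: Callable[[int], float],
    rerank: Callable[[List[Tuple[int, float]], int], List[int]],
    coarse_keep: int,
    final_keep: int,
) -> List[int]:
    # 1. Recall: call every channel and pool the note ids, dropping duplicates.
    recalled: List[int] = []
    seen = set()
    for channel in channels:
        for note_id in channel():
            if note_id not in seen:
                seen.add(note_id)
                recalled.append(note_id)
    # 2. Coarse ranking: cheap score, sort, truncate.
    coarse = sorted(recalled, key=coarse_score, reverse=True)[:coarse_keep]
    # 3. Fine ranking: expensive score for every survivor, no truncation.
    fine = [(note_id, fine_score(note_id)) for note_id in coarse]
    # 4. Re-ranking chooses what is finally shown.
    return rerank(fine, final_keep)


def take_top(scored: List[Tuple[int, float]], k: int) -> List[int]:
    """Placeholder re-ranker (an assumption): keep the k best fine scores."""
    best = sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
    return [note_id for note_id, _ in best]


shown = run_pipeline(
    channels=[lambda: list(range(0, 60)), lambda: list(range(40, 100))],
    coarse_score=lambda n: n % 10,   # toy stand-in for a small model
    fine_score=lambda n: float(n),   # toy stand-in for a large model
    rerank=take_top,
    coarse_keep=20,
    final_keep=5,
)
print(shown)
```

A real re-ranker would also consider diversity; `take_top` is only there so the sketch runs end to end.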

The goal of the recommender system is to select a few dozen items from the item database to display to the user.
In the Xiaohongshu scenario, the items are notes.

We have hundreds of millions of notes. The first stage of the pipeline is recall, which quickly retrieves some notes from the note database.
In practice, a recommender system has many recall channels.

Common ones include collaborative filtering, two-tower models, and followed authors.

For example, Xiaohongshu's recommendation system has dozens of recall channels, each retrieving dozens to hundreds of notes. Together these channels return several thousand notes.
The recommendation system then merges these notes and performs deduplication and filtering.
Filtering means excluding notes the user dislikes, notes by authors the user dislikes, and notes on topics the user dislikes. Once several thousand notes have been recalled, the next step is ranking.
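
The merge, deduplication, and filtering step could look roughly like the sketch below. The `Note` fields, the function `merge_and_filter`, and the three "disliked" sets are assumptions made for illustration; the lecture does not say how these are represented.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Note:
    note_id: int
    author: str
    topic: str


def merge_and_filter(
    channel_results: Iterable[List[Note]],
    disliked_note_ids: Set[int],
    disliked_authors: Set[str],
    disliked_topics: Set[str],
) -> List[Note]:
    # Merge all channels; keying by note_id removes duplicates
    # (the first copy of each note wins).
    merged = {}
    for notes in channel_results:
        for note in notes:
            merged.setdefault(note.note_id, note)
    # Filter out disliked notes, authors, and topics.
    return [
        note
        for note in merged.values()
        if note.note_id not in disliked_note_ids
        and note.author not in disliked_authors
        and note.topic not in disliked_topics
    ]


# Two hypothetical channels that overlap on note 2.
channel_a = [Note(1, "a", "food"), Note(2, "b", "nba")]
channel_b = [Note(2, "b", "nba"), Note(3, "c", "travel"),
             Note(4, "d", "food"), Note(5, "e", "pets")]

candidates = merge_and_filter(
    [channel_a, channel_b],
    disliked_note_ids={5},
    disliked_authors={"c"},
    disliked_topics={"nba"},
)
print([note.note_id for note in candidates])
```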

Ranking uses machine learning models to estimate the user's interest in each note and keeps the notes with the highest scores.
Scoring thousands of notes one by one with a large-scale neural network would be very expensive.

To solve this computation problem, ranking is usually split into two steps: coarse ranking and fine ranking.

Coarse ranking uses a simpler model to quickly score the thousands of notes and keeps the few hundred with the highest scores.
Fine ranking uses a large neural network to score those few hundred notes. The fine-ranking model is much larger than the coarse-ranking model and uses more features.

As a result, fine-ranking scores are more reliable, but fine ranking is computationally very expensive.
That is why we filter with coarse ranking first and then apply fine ranking. Doing so strikes a better balance between computation and accuracy.
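
To make the cost argument concrete, here is a minimal sketch that counts how often each model is called. The two scoring functions are toy stand-ins invented for this example (the "expensive" one is just a slightly more precise formula), and the sizes 2000 and 200 are made-up numbers.

```python
calls = {"cheap": 0, "expensive": 0}


def cheap_score(note_id):
    # Toy stand-in for the small coarse-ranking model.
    calls["cheap"] += 1
    return -abs(note_id - 500)


def expensive_score(note_id):
    # Toy stand-in for the large fine-ranking model.
    calls["expensive"] += 1
    return -abs(note_id - 503.2)


def two_stage_rank(candidates, coarse, fine, keep):
    # Coarse ranking sees every candidate, then truncates.
    shortlist = sorted(candidates, key=coarse, reverse=True)[:keep]
    # Fine ranking only sees the shortlist.
    scored = [(note_id, fine(note_id)) for note_id in shortlist]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranked = two_stage_rank(range(2000), cheap_score, expensive_score, keep=200)
print(calls, ranked[0][0])
```

The expensive function runs 200 times instead of 2000, yet the best note it finds is still the one it would have preferred overall, because the cheap score is a reasonable approximation of it.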

After coarse and fine ranking we have a few hundred notes, each with a score indicating how interested the user is in it.
You could simply sort the notes by model score and show them to the user.

However, the results at this point still have some shortcomings and need some adjustment.

This step is called re-ranking. Re-ranking mainly considers diversity:
it samples randomly based on diversity to select a few dozen notes from the few hundred,
then uses rules to spread apart notes with similar content.

I'll explain re-ranking later. The result of re-ranking is what is finally displayed to the user,
for example the top 80 items, which include both notes and ads.

A quick note: the numbers here are all made up. It is not convenient for me to disclose Xiaohongshu's real numbers.
Next I will briefly introduce the coarse-ranking and fine-ranking models. The two are very similar.
The only difference is that the fine-ranking model is larger and uses more features.

The model's input includes user features, candidate item features, and statistical features.

Suppose we want to judge whether Xiao Wang is interested in a certain note. We feed the note's features, Xiao Wang's features, and many statistical features into the neural network.

Neural networks come in many architectures, which I won't go into here; that is left for a later lesson.
The neural network outputs several values, such as click-through rate, like rate, collect rate, and share rate. These values are all the neural network's estimates of user behavior.
The larger a value, the more interested the user is in the note.
Finally, the multiple estimates are fused into a final score, for example by a weighted sum.
This score determines whether the note is displayed to the user and whether it appears near the front or the back.
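
As a small worked example of the weighted-sum fusion mentioned above: the weights and the predicted rates below are invented purely for illustration, and `fuse` is a hypothetical helper, not a formula from the lecture.

```python
def fuse(predictions, weights):
    """Weighted sum of the model's per-behavior estimates."""
    return sum(weights[name] * predictions[name] for name in weights)


# Made-up weights: rarer, stronger signals get larger weights here.
weights = {"click": 1.0, "like": 2.0, "collect": 3.0, "share": 4.0}

# Made-up model outputs for two candidate notes.
estimates = {
    "note_a": {"click": 0.10, "like": 0.02, "collect": 0.01, "share": 0.005},
    "note_b": {"click": 0.08, "like": 0.05, "collect": 0.03, "share": 0.010},
}

final_scores = {name: fuse(pred, weights) for name, pred in estimates.items()}
order = sorted(final_scores, key=final_scores.get, reverse=True)
print(final_scores, order)
```

Here note_a has the higher click-through estimate, but note_b ends up first because its like, collect, and share estimates are higher: roughly 0.31 against 0.19.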

Note that this is the scoring of just one note. Coarse ranking has to score thousands of notes, and fine ranking has to score hundreds.
Each note has multiple estimated scores, which are fused into one score that serves as the basis for ranking that note.

The last stage of the pipeline is re-ranking. The most important function of re-ranking is diversity sampling.
A few dozen notes must be selected from a few hundred. Common methods include MMR and DPP. Sampling rests on two criteria:
one is the fine-ranking score, the other is diversity.
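
Of the two methods named above, MMR (maximal marginal relevance) is the simpler to sketch. The version below is the standard greedy form, not necessarily what Xiaohongshu runs: at each step it picks the note with the best trade-off between its score and its similarity to notes already chosen. The trade-off weight `theta`, the note names, the scores, and the topic-based similarity are all assumptions for illustration.

```python
def mmr_select(scores, similarity, k, theta=0.5):
    """Greedy MMR: balance each note's score against redundancy."""
    selected = []
    remaining = set(scores)
    while remaining and len(selected) < k:

        def marginal(note):
            # Redundancy is the similarity to the closest selected note.
            redundancy = max(
                (similarity(note, chosen) for chosen in selected),
                default=0.0,
            )
            return theta * scores[note] - (1 - theta) * redundancy

        # sorted() only makes tie-breaking deterministic.
        best = max(sorted(remaining), key=marginal)
        selected.append(best)
        remaining.remove(best)
    return selected


def same_topic(a, b):
    # Toy similarity: 1.0 if the name prefix (the topic) matches.
    return 1.0 if a.split("_")[0] == b.split("_")[0] else 0.0


fine_scores = {"nba_1": 0.9, "nba_2": 0.85, "food_1": 0.6, "travel_1": 0.5}
picked = mmr_select(fine_scores, same_topic, k=3)
print(picked)
```

Although nba_2 has the second-highest score, it loses to the food and travel notes once nba_1 has been picked, which is the diversity effect the text describes.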

After sampling, rules are used to spread apart similar content.
We cannot put notes that are too similar in adjacent positions.
For example, suppose that by fine-ranking score the top five notes are all NBA content. That is not appropriate:
even if the user is a basketball fan, they do not necessarily want to see homogeneous content.

If the note in first place is about the NBA, then the next several positions cannot hold NBA content, and similar notes are moved further back.
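
One simple way to express such a rule is a greedy pass that refuses to repeat a topic within the next few positions. This is a hedged sketch, not the production rule set: the function `scatter`, the window size `gap`, and the topic-from-name convention are assumptions.

```python
def scatter(notes, topic_of, gap=2):
    """Reorder so a topic is not repeated within `gap` positions if avoidable."""
    pending = list(notes)
    result = []
    while pending:
        # Topics used in the last `gap` positions are temporarily blocked.
        recent = {topic_of(n) for n in result[-gap:]} if gap > 0 else set()
        # Take the highest-ranked pending note whose topic is not blocked;
        # if every pending note is blocked, fall back to the first one.
        pick = next((n for n in pending if topic_of(n) not in recent), pending[0])
        pending.remove(pick)
        result.append(pick)
    return result


def topic_of(name):
    return name.split("_")[0]


# Input is already ordered by score, best first.
ranked_notes = ["nba_1", "nba_2", "nba_3", "food_1", "travel_1", "nba_4"]
spread = scatter(ranked_notes, topic_of, gap=2)
print(spread)
```

The remaining NBA notes are pushed back behind the food and travel notes. Once only NBA notes are left, the fallback places them consecutively, since there is nothing else to interleave.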

Another purpose of re-ranking is to insert ads and operations-promoted content. The order is also adjusted according to ecosystem requirements; for example, many beauty pictures cannot be shown back to back.


OK, to summarize this section: this lesson briefly introduced the recommender system pipeline.

The first stage of the pipeline is recall. We have many recall channels, which quickly retrieve several thousand notes from hundreds of millions of notes as the candidate set.
Ranking then decides which notes are exposed to the user and in what order. Ranking is done in steps.

First comes coarse ranking, which scores the thousands of notes with a small-scale neural network and sends the few hundred with the highest scores to fine ranking.
Of course, there are also some rules to ensure that the notes entering fine ranking are diverse.

Next comes fine ranking, which uses a large-scale neural network to score the few hundred notes chosen by coarse ranking. After scoring, there is no need to sort or truncate.

These few hundred notes all enter re-ranking carrying their fine-ranking scores. Re-ranking performs diversity sampling, selecting a few dozen notes from the few hundred,
then spreads them apart with rules and inserts ads and operations content.
The rules for re-ranking are very complex, running to thousands of lines of code.

Across the whole pipeline, recall and coarse ranking are the biggest funnels. They cut the number of candidate notes from hundreds of millions to several thousand, and then to a few hundred.
Only when there are just a few hundred candidate notes can a large-scale neural network be used for fine ranking and a method such as DPP be used for diversity sampling.
If the number of notes is too large, it is impossible to use large-scale neural networks and DPP.


Summary

Note: this series of articles can help you learn recommender systems systematically.

(1) When you apply for a job, your resume needs to match your research direction and work experience to the job requirements of the employer, so that it meets the company's hiring needs. Otherwise your resume will simply be rejected.
(2) Are you joining the company to work on recommender systems? Or pure CV? NLP? Speech? General deep learning and machine learning? Hardware? Front-end development? Back-end development? Test development? Product? HR? Administration? You can't do everything. You need to pick a direction and build up your own experience before you send out applications. Otherwise, what will the interviewer talk to you about?
(3) Today's takeaways on recommender systems: the goal of the recommender system is to select a few dozen items from the item database to display to the user. The pipeline is divided into recall, coarse ranking, fine ranking, and re-ranking. To solve the computation problem, ranking is usually split into two steps: coarse ranking and fine ranking.

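Original website

Copyright notice
This article was written by [Ice Dew Coke]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/215/202208031551490612.html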