What is a good recommendation system?

2022-06-12 07:35:00 bugmaker.

What is a recommendation system

A recommendation system is a tool that automatically connects users with items. It helps users find information that interests them in an environment of information overload, and it pushes information to the users who are likely to be interested in it.

How recommendation systems differ from classified directories and search engines

As everyone knows, countless scientists and engineers have proposed excellent solutions to the problem of information overload. The most representative are the classified directory and the search engine, and these two solutions gave rise to two famous Internet companies, Yahoo and Google. Yahoo started out as a classified directory, and today the better-known directory sites include DMOZ abroad and Hao123 in China. These directories sort well-known websites into categories so that users can find sites by category. But as the Internet kept growing, directory sites could cover only a small number of popular websites and became less and less able to meet users' needs.

So the search engine was born. Search engines, represented by Google, let users find the information they need by entering keywords. However, a search engine requires the user to actively supply accurate keywords, so it cannot address many other needs: when users cannot find keywords that accurately describe what they want, the search engine is of no help.

Like a search engine, a recommendation system is a tool that helps users quickly find useful information. Unlike a search engine, it does not require users to state an explicit need. Instead, it models users' interests by analyzing their historical behavior and then proactively recommends information that matches those interests and needs. In a sense, therefore, recommendation systems and search engines are complementary tools: the search engine serves users who are actively looking for something with a clear goal, while the recommendation system helps users discover new content of interest when they have no clear goal.

Types of recommendation systems

  1. Ask a friend. We might open a chat tool, find a few friends who watch a lot of movies, and ask them whether they have any to recommend. We could even open Weibo, post "I want to see a movie", and wait for enthusiastic followers to suggest one. In recommendation systems this approach is called social recommendation: letting your friends recommend items to you.
  2. Most of us have favorite actors and directors. Some people open a search engine, type in a favorite actor's name, and look through the results for movies they have not seen yet. For example, I like Stephen Chow's movies very much, so I search Douban for Stephen Chow (Zhou Xingchi), discover an early film of his that I have not seen, and watch it. This approach looks for movies that are similar in content to the ones you have already watched. A recommendation system can automate the process: it analyzes the movies a user has seen to find the actors and directors the user likes, then recommends other films by those actors or directors. In recommendation systems this approach is called content-based recommendation (content-based filtering).
  3. We may also look at a leaderboard, such as the well-known IMDB movie charts, to see what other people are watching and which movies they like, and then pick one that is well received. This approach can be taken further: if you can find a group of users whose historical interests are similar to yours and look at what they have watched recently, the result may match your interests better than a general popularity chart. This is called recommendation based on collaborative filtering (collaborative filtering).
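The third approach can be sketched in a few lines. This is only a minimal illustration of user-based collaborative filtering, assuming each user's history is a plain set of movie titles and using Jaccard overlap as the similarity measure; the function names, the `k` and `n` parameters, and the sample data are all hypothetical.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity between two sets of watched movies."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend_user_based(target, watched, k=2, n=3):
    """Recommend up to n unseen movies, scored by the k most similar users."""
    seen = watched[target]
    # Rank the other users by how much their history overlaps with the target's.
    neighbours = sorted(
        ((jaccard(seen, movies), user)
         for user, movies in watched.items() if user != target),
        reverse=True,
    )[:k]
    scores = defaultdict(float)
    for sim, user in neighbours:
        # Only movies the target has not watched are candidates.
        for movie in watched[user] - seen:
            scores[movie] += sim
    # Sort by score, then by title so that ties are deterministic.
    return sorted(scores, key=lambda m: (-scores[m], m))[:n]


# Hypothetical viewing histories.
watched = {
    "alice": {"A", "B", "C"},
    "bob": {"A", "B", "D"},
    "carol": {"A", "C", "D", "E"},
    "dave": {"F"},
}
print(recommend_user_based("alice", watched))  # ['D', 'E']
```

Here bob and carol are the users most similar to alice, so the movies they watched that alice has not seen are recommended, with "D" ranked first because both neighbours watched it.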

Applications of personalized recommendation systems

E-commerce, movie and video sites, social networks, advertising
**Music software:** The music recommendation algorithm comes mainly from a project called the Music Genome Project. The project started on January 6, 2000, and its members include musicians and engineers with an interest in music. The algorithm is mainly content-based: the project's musicians and researchers personally listened to tens of thousands of songs by different singers and annotated each song's characteristics (such as melody, rhythm, arrangement, and lyrics). These annotations are called the genes of the music. The algorithm then computes the similarity between songs from the expert-annotated genes and recommends to a user other music whose genes are similar to the music that user has liked before.
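The gene-based similarity step might look like the sketch below. The source does not say how the genes are encoded or which similarity measure is used, so this assumes each song is a numeric vector of hypothetical gene scores (melody, rhythm, arrangement, lyrics) and uses cosine similarity purely for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two gene vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(liked, catalog, n=2):
    """Return the n songs whose genes are closest to those of the liked song."""
    target = catalog[liked]
    others = [song for song in catalog if song != liked]
    return sorted(others, key=lambda s: cosine(target, catalog[s]), reverse=True)[:n]


# Hypothetical expert annotations: [melody, rhythm, arrangement, lyrics].
catalog = {
    "song_a": [0.9, 0.2, 0.5, 0.1],
    "song_b": [0.8, 0.3, 0.4, 0.2],
    "song_c": [0.1, 0.9, 0.2, 0.8],
}
print(most_similar("song_a", catalog, n=1))  # ['song_b']
```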

Recommendation system evaluation

Experimental methods for recommendation

Offline experiments

An offline experiment generally consists of the following steps:
(1) Obtain user behavior data from the log system and generate a standard data set in a fixed format;
(2) Split the data set into a training set and a test set according to some rule;
(3) Train the user interest model on the training set and make predictions on the test set;
(4) Evaluate the algorithm's predictions on the test set using predefined offline metrics.
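The four steps above can be sketched end to end as follows. This is an illustrative skeleton only: the "user interest model" is replaced by a simple popularity counter, the split rule is a random hold-out, and precision and recall stand in for the predefined offline metrics. All names and the sample log are assumptions, not part of the original text.

```python
import random
from collections import Counter, defaultdict


def split(records, test_ratio=0.25, seed=0):
    """Step 2: randomly divide (user, item) records into train and test sets."""
    rng = random.Random(seed)
    train, test = [], []
    for rec in records:
        (test if rng.random() < test_ratio else train).append(rec)
    return train, test


def train_popularity(train):
    """Step 3a: a stand-in model that just counts how often each item occurs."""
    return Counter(item for _, item in train)


def recommend(model, seen, n):
    """Step 3b: recommend the n most popular items the user has not seen."""
    return [item for item, _ in model.most_common() if item not in seen][:n]


def precision_recall(model, train, test, n=2):
    """Step 4: compare the recommendations with the held-out behavior."""
    seen, truth = defaultdict(set), defaultdict(set)
    for user, item in train:
        seen[user].add(item)
    for user, item in test:
        truth[user].add(item)
    hits = recommended = relevant = 0
    for user, items in truth.items():
        recs = recommend(model, seen[user], n)
        hits += len(set(recs) & items)
        recommended += len(recs)
        relevant += len(items)
    precision = hits / recommended if recommended else 0.0
    recall = hits / relevant if relevant else 0.0
    return precision, recall


# Step 1: a hypothetical behavior log already converted to (user, item) pairs.
records = [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u2", "c"),
           ("u3", "b"), ("u3", "c"), ("u3", "d"), ("u4", "a")]
train, test = split(records)
print(precision_recall(train_popularity(train), train, test))
```

Swapping in a real model only requires replacing `train_popularity` and `recommend`; the split and evaluation steps stay the same.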

User surveys

There is a gap between the metrics of an offline experiment and actual business metrics. For example, prediction accuracy and user satisfaction can differ considerably: high prediction accuracy does not imply high user satisfaction. To evaluate an algorithm accurately, therefore, a relatively realistic environment is needed. The most direct way is to test the algorithm online, but when we are not sure whether the algorithm will lower user satisfaction, going online directly is risky, so a test called a user survey is usually run first.
A user survey needs some real users who complete a set of tasks on the recommendation system under test. While they work on the tasks, we observe and record their behavior and ask them to answer some questions. Finally, we analyze their behavior and answers to understand how the system performs. In most cases it is difficult to run a user survey at large scale, and the conclusions drawn from a survey with few participants often have no statistical significance. So when running a user survey, we have to control the cost on the one hand and ensure the statistical significance of the results on the other. In addition, test users should not be chosen arbitrarily. Their distribution should match that of real users as closely as possible, for example half men and half women, with the same distribution of age and activity level as the real user base. The survey should also be double-blind where possible: neither the experimenters nor the users should know the test objectives in advance, so that the users' answers and the experimenters' measurements are not influenced by subjective factors.
The advantages and disadvantages of user surveys are both clear. The advantage is that they yield many metrics reflecting users' subjective impressions, and compared with online experiments the risk is low and mistakes are easy to correct. The disadvantage is that recruiting test users is expensive and large groups of test users are hard to organize, so the test results often lack statistical significance. Moreover, a double-blind experiment is very difficult to design in many cases, and users may behave differently in a test environment than in the real one, so metrics collected in the test environment may not be reproducible in the real environment.

Online experiments

After the offline experiments and any necessary user surveys are done, the recommendation system can be put online for an A/B test and compared with the old algorithm.
A/B testing is a very common method for evaluating algorithms online. It randomly divides users into several groups according to some rule and applies a different algorithm to each group, then compares the algorithms through the evaluation metrics collected for each group. For example, you can measure the click-through rate of each group and use it to compare the algorithms' performance.
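As a rough sketch of this procedure, the code below assigns users to groups and compares click-through rates. The text does not specify the grouping rule, so hashing the user ID with a fixed salt is an assumption made here for illustration (it is effectively random across users yet stable for each user); the function names, the salt, and the log entries are hypothetical.

```python
import hashlib


def assign_group(user_id, groups=("A", "B"), salt="rec-algo-test"):
    """Map a user to a group: deterministic per user, spread evenly across users."""
    digest = hashlib.md5(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return groups[int(digest, 16) % len(groups)]


def click_through_rate(events):
    """Aggregate (group, impressions, clicks) rows into a click-through rate per group."""
    totals = {}
    for group, impressions, clicks in events:
        imp, clk = totals.get(group, (0, 0))
        totals[group] = (imp + impressions, clk + clicks)
    return {g: (clk / imp if imp else 0.0) for g, (imp, clk) in totals.items()}


# The same user always lands in the same group.
print(assign_group("user42") == assign_group("user42"))  # True

# Hypothetical aggregated log rows: (group, impressions, clicks).
events = [("A", 100, 5), ("B", 100, 8), ("A", 100, 7)]
print(click_through_rate(events))
```

A difference in click-through rate between the groups still needs a significance check before one algorithm is declared better.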
The advantage of A/B testing is that it gives a fair comparison of how different algorithms actually perform online, including the metrics the business cares about. Its main disadvantage is the long cycle: an experiment has to run for a long time to produce reliable results. For this reason A/B testing is generally not used for every algorithm, only for those that performed well in offline experiments and user surveys.
In addition, designing an A/B testing system for a large website is itself a complex engineering task. A large website's architecture is divided into a front end and a back end, and between the interface shown to the user at the front and the algorithm at the back there are many layers. These layers are often controlled by different teams, and each of them may run A/B tests. If each layer designs its own A/B testing system, the tests often interfere with one another. For example, if we run an A/B test on a back-end recommendation algorithm while the web team is running an A/B test on the interface of the recommendation page, we cannot tell whether a change in the results was caused by our algorithm or by the change to the recommendation interface. Traffic splitting is therefore the key to A/B testing: the different layers and the teams that control them need to obtain their A/B test traffic from one unified place, and the traffic of different layers should be orthogonal.
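One common way to keep the traffic of different layers orthogonal is to hash the user ID together with a per-layer salt, so that a user's bucket in one layer says nothing about their bucket in another. The sketch below assumes this scheme, which the text does not prescribe; the layer names and the `bucket` helper are hypothetical.

```python
import hashlib
from collections import Counter


def bucket(user_id, layer, n_buckets=2):
    """Hash the user ID with a per-layer salt so each layer splits traffic independently."""
    digest = hashlib.sha256(f"{layer}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % n_buckets


# Cross-tabulate the assignments of two hypothetical layers.
users = [f"user{i}" for i in range(10000)]
cells = Counter((bucket(u, "ui-layer"), bucket(u, "algo-layer")) for u in users)
for cell in sorted(cells):
    # With orthogonal traffic each of the four cells holds roughly a quarter of the users.
    print(cell, cells[cell] / len(users))
```

Because every combination of interface variant and algorithm variant receives about the same share of users, an effect measured in one layer is not confounded with the experiment running in the other.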

Evaluation indicators

User satisfaction

Prediction accuracy

Prediction accuracy measures how well a recommendation system or algorithm predicts user behavior. It is the most important offline evaluation metric for recommendation systems: since the day recommendation systems were born, almost 99% of recommendation-related papers have discussed it.
Regarding the strengths and weaknesses of RMSE and MAE, Netflix holds that RMSE penalizes inaccurately predicted user-item ratings more heavily (because of the squared term), so it evaluates a system more strictly. Studies have shown that if the rating system is integer-based (that is, users give integer ratings), rounding the predictions lowers the MAE.
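Using the standard definitions, RMSE is the square root of the mean squared difference between actual and predicted ratings, and MAE is the mean absolute difference. The sketch below computes both over a list of (actual, predicted) pairs; the function names and the sample ratings are made up for illustration, and the list is assumed to be non-empty.

```python
import math


def rmse(pairs):
    """Root mean squared error over (actual, predicted) rating pairs."""
    return math.sqrt(sum((r - p) ** 2 for r, p in pairs) / len(pairs))


def mae(pairs):
    """Mean absolute error over (actual, predicted) rating pairs."""
    return sum(abs(r - p) for r, p in pairs) / len(pairs)


# Two hypothetical prediction sets with the same total absolute error:
# four errors of 1 versus a single error of 4.
even = [(3, 2), (3, 2), (3, 2), (3, 2)]
skewed = [(3, 3), (3, 3), (3, 3), (5, 1)]
print(mae(even), mae(skewed))    # 1.0 1.0
print(rmse(even), rmse(skewed))  # 1.0 2.0

# On this integer-rated sample, rounding the predictions lowers the MAE.
raw = [(4, 3.6), (2, 2.4), (5, 4.7)]
rounded = [(r, round(p)) for r, p in raw]
print(mae(raw) > mae(rounded))   # True
```

The first comparison shows the squared-term penalty: both sets have the same MAE, but RMSE doubles when the error is concentrated in one badly predicted rating.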

Coverage

Coverage describes a recommendation system's ability to surface the long tail of items. Coverage can be defined in several ways. The simplest is the proportion of all items that the system actually recommends. Suppose the system's user set is U and the system recommends to each user u an item list R(u) of length N. With I denoting the set of all items, the coverage can be computed with the following formula:

$$\text{Coverage} = \frac{\left|\bigcup_{u \in U} R(u)\right|}{|I|}$$
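In code, the simplest definition of coverage (the share of the item catalog that appears in at least one user's recommendation list) can be sketched as follows. The function name, argument names, and sample data are illustrative assumptions.

```python
def coverage(recommendations, all_items):
    """Fraction of the catalog that appears in at least one recommendation list."""
    catalog = set(all_items)
    recommended = set()
    for items in recommendations.values():
        recommended.update(items)
    # Ignore anything recommended that is not part of the catalog.
    return len(recommended & catalog) / len(catalog)


# Hypothetical recommendation lists of length 2 for two users.
recommendations = {"u1": ["a", "b"], "u2": ["b", "c"]}
all_items = {"a", "b", "c", "d", "e"}
print(coverage(recommendations, all_items))  # 0.6
```

A system that recommends only a few popular items to everyone scores low on this metric, while one that reaches into the long tail scores closer to 1.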

Diversity, novelty, serendipity, trust, real-time performance, robustness

Copyright notice
This article was written by [bugmaker.]. Please include the original link when reposting. Thanks.
https://yzsam.com/2022/03/202203010556252541.html