The cold start problem in recommender systems
2022-06-12 07:35:00 【bugmaker.】
A recommender system predicts a user's future behavior and interests from the user's historical behavior and interests, so a large amount of user behavior data is both an important component of such a system and a prerequisite for it. The cold start problem is how to design a personalized recommender system without that data and still produce recommendations good enough that users are willing to keep using it. There are three main types of cold start: user cold start, item cold start, and system cold start.
User cold start
User cold start is the problem of making personalized recommendations for new users. When a new user arrives, we have no data on their behavior, so we cannot predict their interests from history and therefore cannot make personalized recommendations for them.
Use user registration information
Personalized recommendation based on registration information works roughly as follows:
(1) Get the user's registration information;
(2) Classify the user according to that registration information;
(3) Recommend to the user the items that users in the same category like.
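The three steps above can be sketched as follows. This is a minimal illustration, not a production design: the `segment` rule (gender plus an age band), the field names, and the toy data are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def segment(profile):
    """Step 2: map registration info to a coarse user category (hypothetical rule)."""
    age_band = "under30" if profile["age"] < 30 else "30plus"
    return (profile["gender"], age_band)


def build_segment_popularity(profiles, likes):
    """Count, for each category, how many of its users liked each item."""
    popularity = defaultdict(Counter)
    for user_id, items in likes.items():
        popularity[segment(profiles[user_id])].update(set(items))
    return popularity


def recommend_for_new_user(profile, popularity, n=2):
    """Step 3: return the n most liked items in the new user's category."""
    counts = popularity.get(segment(profile), Counter())
    # Sort by descending count, then by item id so ties are deterministic.
    ranked = sorted(counts, key=lambda item: (-counts[item], item))
    return ranked[:n]


# Toy data: existing users with registration info and liked items.
profiles = {
    "u1": {"gender": "f", "age": 25},
    "u2": {"gender": "f", "age": 28},
    "u3": {"gender": "m", "age": 40},
}
likes = {"u1": ["a", "b"], "u2": ["a", "c"], "u3": ["d"]}

popularity = build_segment_popularity(profiles, likes)
# Step 1: the new user's registration information.
suggested = recommend_for_new_user({"gender": "f", "age": 22}, popularity)
unknown = recommend_for_new_user({"gender": "m", "age": 20}, popularity)
```

A category with no existing users yields an empty list here; a real system would probably fall back to globally popular items in that case.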
Choose suitable items to elicit the user's interests
Another way to address user cold start is that, when a new user first visits the recommender system, the system does not show recommendation results immediately. Instead, it presents some items and asks the user for feedback on them, then provides personalized recommendations based on that feedback.
Item cold start
Item cold start is the problem of recommending a new item to the users who are likely to be interested in it. It matters a great deal for news sites and other sites with strong timeliness, because new items are added to those sites all the time and each item must be shown to users as soon as it appears; otherwise, after a while, its value drops sharply.
For the UserCF algorithm
For UserCF, the problem to solve is the initial driving force: where the first users discover a new item. As long as a small group of people can find and like the new item, UserCF can spread it to more users. The simplest way to provide that initial push is to show new items to users at random, but that is clearly not very personalized. A better option is to use the item's content information and show the new item first to users who have liked other items with similar content.
For the ItemCF algorithm
For ItemCF, item cold start is a serious problem. ItemCF recommends items similar to the ones a user has liked before, and it recomputes the item similarity table from user behavior only periodically (usually once a day). So when a new item is added, it does not appear in the item-related table held in memory. If new items are not shown to users, users cannot act on them, and ItemCF therefore cannot recommend them. For this reason, we can only use the item's content information to compute the item-related table, and we must update that table frequently.
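As a rough sketch of that content-based seeding idea, the snippet below picks the first users to show a new item to. It assumes each item carries a set of keywords and uses Jaccard overlap as the content similarity; the function names, the keyword representation, and the data are hypothetical choices for illustration only.

```python
def jaccard(a, b):
    """Jaccard overlap between two keyword collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def seed_users_for_new_item(new_item_keywords, item_keywords, likes, k=2):
    """Return up to k users whose liked items are most similar in content to the new item."""
    scores = {}
    for user, liked in likes.items():
        score = sum(
            jaccard(new_item_keywords, item_keywords[item])
            for item in liked
            if item in item_keywords
        )
        if score > 0:
            scores[user] = score
    # Highest score first; user id breaks ties deterministically.
    return sorted(scores, key=lambda u: (-scores[u], u))[:k]


# Toy data: keywords of existing items and what each user liked.
item_keywords = {
    "i1": {"nba", "sports"},
    "i2": {"politics"},
    "i3": {"sports", "football"},
}
likes = {"u1": ["i1", "i3"], "u2": ["i2"], "u3": ["i3"]}

seeds = seed_users_for_new_item({"sports", "nba"}, item_keywords, likes)
```

Once these seed users interact with the new item, ordinary UserCF can take over and spread it to similar users.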
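One way to picture the frequent update is an incremental insert into the item-related table when a new item arrives, using only content similarity. The sketch below is an assumption-laden illustration: the table layout (item mapped to a best-first list of `(similarity, item)` pairs), the `sim` callback, and the `keyword_overlap` helper are all invented for the example and are not part of any particular ItemCF implementation.

```python
import heapq


def add_item_to_related_table(table, new_item, content, sim, k=2):
    """Insert new_item into the item-related table using content similarity only.

    table:   dict item -> list of (similarity, other_item), best first
    content: dict item -> content features, including new_item
    sim:     function(features_a, features_b) -> float
    """
    scored = []
    for other in table:
        if other == new_item:
            continue
        s = sim(content[new_item], content[other])
        if s > 0:
            scored.append((s, other))
            # The new item may also belong in the existing item's top-k list.
            table[other] = heapq.nlargest(k, table[other] + [(s, new_item)])
    table[new_item] = heapq.nlargest(k, scored)
    return table


def keyword_overlap(a, b):
    """Hypothetical content similarity: Jaccard overlap of keyword sets."""
    return len(a & b) / len(a | b)


content = {
    "x": {"rock", "guitar"},
    "y": {"jazz", "piano"},
    "new": {"rock", "piano", "guitar"},
}
# Existing table, e.g. computed earlier from user behavior.
table = {"x": [(0.1, "y")], "y": [(0.1, "x")]}

add_item_to_related_table(table, "new", content, keyword_overlap, k=2)
related_to_new = [item for _, item in table["new"]]
```

Behavior-based and content-based similarities are not on the same scale, so mixing them in one list as done here is a simplification; a real system would need to calibrate or keep them separate.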
System cold start
System cold start is the problem of designing a personalized recommender system for a newly launched website (one with no users and no user behavior yet, only some information about the items), so that users can experience personalized recommendations as soon as the site is released.
Make use of experts
Many recommender systems have, when they are first built, neither user behavior data nor enough content information to compute item similarity accurately. To give users a good experience from the start, many systems rely on expert annotation.
It is well known that similarity between pieces of music is hard to compute. First, music is multimedia: computing song similarity from audio analysis has a high technical barrier, and the results are rarely satisfactory. Second, attribute information such as album and artist alone rarely yields a satisfactory song similarity table, because an artist or an album often has only one or two good songs. To solve this problem, Pandora hired a group of musicians with computer skills to carry out a project called the Music Genome Project. They listened to songs by tens of thousands of artists and annotated each song along many dimensions. In the end they used more than 400 features (Pandora calls these features genes). Once all songs are annotated, each song can be represented as a 400-dimensional vector, and song similarity can then be computed with common vector similarity measures.
Jinni's Movie Genome project uses a semi-manual, semi-automatic approach. First, experts annotate movies; each movie has about 50 genes, drawn from a pool of about 1,000 genes. Then, once experts have annotated enough samples, Jinni uses natural language understanding and machine learning to annotate movies (especially new ones) automatically, by analyzing user reviews and some content attributes of the movie. In short, Jinni solves the system cold start problem by combining experts with machine learning.
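Cosine similarity is one common choice of vector similarity for annotation vectors of this kind. The sketch below uses made-up 4-dimensional "gene" vectors in place of the roughly 400 dimensions mentioned above; the song names and values are purely illustrative, and nothing here is claimed to be the measure Pandora actually uses.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(song, genes):
    """Return the other song whose gene vector is closest to the given song's."""
    others = [name for name in genes if name != song]
    return max(others, key=lambda name: cosine_similarity(genes[song], genes[name]))


# Toy gene vectors; each dimension stands for one expert-annotated feature.
genes = {
    "song_a": [1.0, 0.0, 1.0, 0.0],
    "song_b": [1.0, 0.0, 1.0, 1.0],
    "song_c": [0.0, 1.0, 0.0, 0.0],
}

nearest = most_similar("song_a", genes)
```

Because the vectors come from expert annotation rather than user behavior, this similarity is available on day one, before any user has listened to anything.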