A Guide to Commonly Used Models and Frameworks for Recommender Systems
2022-08-02 09:34:00 【timerring】
For recent recommender systems papers, see the KDD conference.
Recommender System Overview
Traditional Recommendation Models (Old-School Models)
Collaborative Filtering Models
Collaborative filtering uses the relationships between users, together with users' rating feedback on items, to filter information and find what the target user is likely to be interested in.

The user-item rating matrix (likely to be sparse):
| User \ Item | |||
|---|---|---|---|
| x | x | ||
| x | x | ||
| x | x |
Row vectors represent each user's preferences; column vectors represent each item's attributes.
Similarity is computed from the rating matrix (over rows or over columns). Common similarity measures:
- Cosine similarity
- Pearson correlation coefficient
- Euclidean distance
- Manhattan distance
The two main variants are user-based collaborative filtering and item-based collaborative filtering.
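As a rough illustration of user-based collaborative filtering, the sketch below computes cosine similarity between rows of a toy rating matrix and predicts a missing rating as a similarity-weighted average. The matrix values and the function names (`cosine_similarity`, `predict_user_based`, `item_vector`) are made up for this example, and treating a missing rating as 0 inside the similarity is a simplification.

```python
import math

# Toy user-item rating matrix; 0 marks a missing rating (illustrative data).
ratings = [
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
]

def cosine_similarity(a, b):
    """Cosine of the angle between two rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def predict_user_based(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    num, den = 0.0, 0.0
    for other, row in enumerate(ratings):
        if other == user or row[item] == 0:
            continue  # skip the target user and users who did not rate the item
        sim = cosine_similarity(ratings[user], row)
        num += sim * row[item]
        den += abs(sim)
    return num / den if den else 0.0

def item_vector(ratings, item):
    """Column of the matrix: all users' ratings for one item."""
    return [row[item] for row in ratings]

# User-based: compare rows. Item-based: compare columns.
print("user 0 vs user 1:", round(cosine_similarity(ratings[0], ratings[1]), 3))
print("item 0 vs item 1:", round(
    cosine_similarity(item_vector(ratings, 0), item_vector(ratings, 1)), 3))
print("predicted rating (user 1, item 1):", round(predict_user_based(ratings, 1, 1), 3))
```

Item-based filtering uses the same similarity function on columns instead of rows, which is why `item_vector` is included.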
Matrix Factorization Models
Matrix factorization approximates the rating matrix as the product of two low-rank matrices; the inner products of the resulting factor vectors fill in the missing entries.
Pros: the idea is simple, and prediction is easy;
Cons: incremental training is difficult (when the number of samples surges, the matrix may have to be rebuilt), and fusing additional features is difficult;

Here k is the number of latent factors and is treated as a hyperparameter.
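A minimal sketch of the idea, assuming plain SGD with L2 regularization on the observed entries only. The toy matrix `R`, the function names, and the values of `k`, `lr`, `reg`, and `epochs` are illustrative choices, not settings taken from this article.

```python
import random

# Toy rating matrix; None marks a missing rating (illustrative data only).
R = [
    [5, 3, None, 1],
    [4, None, None, 1],
    [1, 1, None, 5],
    [None, 1, 5, 4],
]

def predict(P, Q, u, i):
    """Predicted rating: inner product of user factors and item factors."""
    return sum(p * q for p, q in zip(P[u], Q[i]))

def rmse(R, P, Q):
    """Root mean squared error over the observed entries."""
    errs = [(r - predict(P, Q, u, i)) ** 2
            for u, row in enumerate(R)
            for i, r in enumerate(row) if r is not None]
    return (sum(errs) / len(errs)) ** 0.5

def factorize(R, k=2, epochs=1000, lr=0.01, reg=0.02, seed=0):
    """Fit R ~= P Q^T with SGD; k is the number of latent factors."""
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(len(R))]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(len(R[0]))]
    observed = [(u, i, r) for u, row in enumerate(R)
                for i, r in enumerate(row) if r is not None]
    for _ in range(epochs):
        rng.shuffle(observed)
        for u, i, r in observed:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

P, Q = factorize(R)
print("train RMSE:", round(rmse(R, P, Q), 3))
print("filled-in rating (user 1, item 1):", round(predict(P, Q, 1, 1), 2))
```

Adding a new user or item means adding a new row to `P` or `Q` and retraining, which is one way to see why incremental training is awkward.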
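Logistic Regression Model
Recasts recommendation as a classification problem: predict whether a user will click on an item. The click probability is obtained by passing the linear score below through the sigmoid function.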
$$
\begin{aligned}
\phi(x) &= w_{0} + w_{1} x_{1} + \cdots + w_{n} x_{n} \\
&= w_{0} + \sum_{i=1}^{n} w_{i} x_{i}
\end{aligned}
$$
Pros: the model is simple, highly interpretable, and fast to train (SGD);
Cons: limited modeling capacity (it ignores correlations and crosses between features), so manual feature engineering is required;
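The sketch below shows the linear score, the sigmoid, and SGD training on a tiny dataset. The two binary features and their labels are hypothetical, chosen only so the result is easy to inspect.

```python
import math

def linear_score(w0, w, x):
    """phi(x) = w0 + sum_i w_i * x_i."""
    return w0 + sum(wi * xi for wi, xi in zip(w, x))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_click(w0, w, x):
    """Estimated click probability."""
    return sigmoid(linear_score(w0, w, x))

def train(X, y, epochs=200, lr=0.5):
    """Plain SGD on the log loss, one sample at a time."""
    w0, w = 0.0, [0.0] * len(X[0])
    for _ in range(epochs):
        for x, label in zip(X, y):
            grad = predict_click(w0, w, x) - label  # d(log loss) / d(phi)
            w0 -= lr * grad
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
    return w0, w

# Hypothetical features: [item_on_sale, user_viewed_before]; label 1 = clicked.
X = [[1, 1], [1, 0], [0, 1], [0, 0]]
y = [1, 1, 0, 0]
w0, w = train(X, y)
print("weights:", [round(v, 2) for v in w], "bias:", round(w0, 2))
print("click probabilities:", [round(predict_click(w0, w, x), 2) for x in X])
```

The learned weights can be read directly as per-feature contributions, which is the interpretability advantage noted above.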
Feature Crossing Models
POLY2
$$
\phi(x) = w_{0} + \sum_{i=1}^{n} w_{i} x_{i} + \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} w_{ij} x_{i} x_{j}
$$
Adds brute-force second-order feature crosses on top of logistic regression.
Pros: the second-order features increase modeling capacity;
Cons: high complexity, growing from $O(n)$ to $O(n^2)$ in the number of features;
Factorization Machine
$$
\phi(x) = w_{0} + \sum_{i=1}^{n} w_{i} x_{i} + \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \left\langle v_{i}, v_{j} \right\rangle x_{i} x_{j}
$$
Each feature is given a latent vector, and the inner product of two latent vectors serves as the weight of the corresponding feature cross.
Pros: compared with POLY2, far fewer model parameters ($n^2 \to nK$), and feature crossing is learned automatically
Cons: feature crossing is limited to second order
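To make the parameter saving concrete, here is a small sketch that evaluates the FM pairwise term in two ways: directly from the double sum in the formula above, and through the well-known rewrite that needs only one pass per latent dimension. Function names, the random latent vectors, and the sizes `n` and `k` are illustrative.

```python
import random

def fm_pairwise_naive(x, V):
    """Double sum over i < j of <v_i, v_j> * x_i * x_j."""
    total = 0.0
    n = len(x)
    for i in range(n - 1):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            total += dot * x[i] * x[j]
    return total

def fm_pairwise_fast(x, V):
    """Same value via 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    total = 0.0
    for f in range(len(V[0])):
        terms = [V[i][f] * x[i] for i in range(len(x))]
        total += sum(terms) ** 2 - sum(t * t for t in terms)
    return 0.5 * total

def fm_score(w0, w, V, x):
    """Full FM score: bias + first-order term + second-order term."""
    return w0 + sum(wi * xi for wi, xi in zip(w, x)) + fm_pairwise_fast(x, V)

def poly2_pair_params(n):
    return n * (n - 1) // 2  # one weight w_ij per feature pair

def fm_pair_params(n, k):
    return n * k             # one k-dimensional latent vector per feature

rng = random.Random(0)
n, k = 6, 3
x = [1.0, 0.0, 2.0, 0.5, 0.0, 1.0]
V = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(n)]
w = [rng.gauss(0, 1) for _ in range(n)]

print("naive:", fm_pairwise_naive(x, V))
print("fast: ", fm_pairwise_fast(x, V))
print("score:", fm_score(0.1, w, V, x))
print("pair parameters for n=1000, k=8:",
      poly2_pair_params(1000), "vs", fm_pair_params(1000, 8))
```

The saving only appears when the number of features is much larger than the latent dimension, which is the usual case with sparse one-hot inputs.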
GBDT+LR
GBDT: acts as the feature encoder. It performs feature selection and feature encoding on the input data and produces discrete feature vectors.
LR (logistic regression): trained on the encoded output.

Pros: flexible, and well suited to adding new features (the tree model handles feature combination)
Cons: the tree model has high complexity
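The encoding step can be sketched as follows: each sample is routed through every tree, the index of the leaf it lands in is one-hot encoded, and the concatenated vector becomes the input to LR. The two hand-written trees below merely stand in for a trained GBDT; their feature names and split thresholds are hypothetical.

```python
# Two hand-written trees stand in for a trained GBDT (hypothetical splits).
def tree_0(x):
    """Three leaves, indexed 0..2."""
    if x["age"] < 30:
        return 0
    return 1 if x["price"] < 100 else 2

def tree_1(x):
    """Two leaves, indexed 0..1."""
    return 0 if x["clicks_last_week"] < 5 else 1

TREES = [(tree_0, 3), (tree_1, 2)]  # (tree, number of leaves)

def encode(x):
    """Concatenate one one-hot leaf indicator per tree."""
    features = []
    for tree, n_leaves in TREES:
        one_hot = [0] * n_leaves
        one_hot[tree(x)] = 1
        features.extend(one_hot)
    return features

sample = {"age": 35, "price": 50, "clicks_last_week": 7}
print(encode(sample))  # discrete vector that a logistic regression would consume
```

Each leaf corresponds to a conjunction of split conditions, so every position in the encoded vector is effectively an automatically discovered feature combination.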
Deep Recommendation Models
Deep Collaborative Filtering (Neural CF)
Treats user ratings of items as a classification problem.
Learns user-item interactions with fully connected layers.

Replaces the inner product used in matrix factorization with a multi-layer neural network.
A fully connected network may model the interaction somewhat better than a plain inner product.
Wide & Deep
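Largely obsolete today.
Wide is a linear model; Deep is a deep model.
Combines a shallow model (memorization) with a deep model (generalization).

The Wide part can memorize IDs and build a model on them, similar to LR.
The Deep part can be viewed as a fully connected network, similar to NCF.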
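A toy forward pass may help show how the two parts combine: the wide logit is a sparse linear lookup over ID features, the deep logit comes from a small fully connected network over embeddings, and the two are summed before the sigmoid. All sizes, the random untrained weights, and the feature keys are assumptions made for this sketch.

```python
import math
import random

rng = random.Random(0)
EMB_DIM, HIDDEN, N_USERS, N_ITEMS = 4, 8, 10, 20

def rand_matrix(rows, cols):
    return [[rng.gauss(0, 0.1) for _ in range(cols)] for _ in range(rows)]

# Deep part: embeddings plus one hidden layer (random, untrained weights).
user_emb = rand_matrix(N_USERS, EMB_DIM)
item_emb = rand_matrix(N_ITEMS, EMB_DIM)
W1 = rand_matrix(HIDDEN, 2 * EMB_DIM)
w2 = [rng.gauss(0, 0.1) for _ in range(HIDDEN)]

# Wide part: one weight per sparse ID or cross feature, stored in a dict.
wide_w = {}

def wide_logit(u, i):
    keys = [("user", u), ("item", i), ("cross", u, i)]
    return sum(wide_w.get(key, 0.0) for key in keys)

def deep_logit(u, i):
    x = user_emb[u] + item_emb[i]  # list concatenation of the two embeddings
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in W1]  # ReLU
    return sum(w * h for w, h in zip(w2, hidden))

def predict(u, i):
    """Sigmoid of the summed wide and deep logits."""
    return 1.0 / (1.0 + math.exp(-(wide_logit(u, i) + deep_logit(u, i))))

before = predict(3, 7)
wide_w[("cross", 3, 7)] = 2.0  # "memorize" that user 3 likes item 7
print("user 3, item 7:", round(before, 3), "->", round(predict(3, 7), 3))
print("user 3, item 8:", round(predict(3, 8), 3), "(unaffected by that cross weight)")
```

The cross weight only affects the exact pair it was set for, which is the memorization behavior; generalization to unseen pairs has to come from the shared embeddings in the deep part.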
DeepFM
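DeepFM consists of an FM part and a DNN part, and the two parts share the same input features. FM replaces the Wide part of Wide & Deep.
DeepFM: first-order features + second-order features + deep features

Drops the original Wide part and uses FM instead, which strengthens shallow feature combination by covering both the first-order and second-order terms.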
DIN
The first of these models to add an attention mechanism.
Weights are adjusted according to the user and the candidate item.
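The weighting idea can be sketched as attention pooling over a user's behavior history: each past item gets a weight that depends on the candidate item, and the user representation is the weighted sum. Note the simplifications: a plain dot product stands in for a learned scoring network, and softmax normalization is applied for readability; neither should be read as the exact DIN formulation. The embeddings are made up.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_pool(candidate, history):
    """Weight each past-behavior embedding by its relevance to the candidate."""
    scores = [dot(candidate, h) for h in history]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
    total = sum(exps)
    weights = [e / total for e in exps]
    pooled = [sum(w * h[d] for w, h in zip(weights, history))
              for d in range(len(candidate))]
    return weights, pooled

# Hypothetical 2-d embeddings of three items the user interacted with.
history = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]

for candidate in ([1.0, 0.0], [0.0, 1.0]):
    weights, pooled = attention_pool(candidate, history)
    print(candidate, [round(w, 3) for w in weights], [round(p, 3) for p in pooled])
```

The same history yields a different user vector for each candidate item, which is the point of conditioning the weights on both the user and the item.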

Recommender System Frameworks & Tools
DeepCTR
https://github.com/shenweichen/DeepCTR
https://github.com/shenweichen/DeepCTR-Torch
https://deepctr-torch.readthedocs.io/en/latest/Quick-Start.html
Implements the classic recommendation models, with support for Keras and PyTorch.
The models and the output processing are well encapsulated, which makes it suitable for competitions.
xlearn
https://github.com/aksnzhy/xlearn
https://xlearn-doc-cn.readthedocs.io/en/latest/
Efficient implementations of LR, FM, and FFM; suitable for offline modeling.
RecBole
RecBole (伯乐): a unified, comprehensive, and efficient recommender system codebase
https://recbole.io/cn/
Supports 72 models and 28 datasets; suitable for academic use

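Text Encoding Methods
Count: counts the number of characters or words in the text
LabelEncoder: assigns each distinct label a unified integer code
Multi One-Hot: multi-value label encoding (for example, the sum of the individual one-hot encodings)
AB: 011, BC: 110, AC: 101
One-Hot: e.g. A: 001, B: 010, C: 100
CountVectorizer: like Multi One-Hot, but records how many times each token occurs
TfidfVectorizer: combines occurrence counts with term-frequency statistics (TF-IDF)
Word2Vec: maps words to vectors, then aggregates them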
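A few of these encodings fit in a short standard-library sketch. The vocabulary order `["C", "B", "A"]` is chosen only so that the output matches the bit patterns listed above; the helper names are made up and are not the scikit-learn classes of similar name.

```python
from collections import Counter

VOCAB = ["C", "B", "A"]  # position order chosen to match the examples above

def one_hot(token):
    """Single-value encoding: exactly one position is 1."""
    return [1 if v == token else 0 for v in VOCAB]

def multi_hot(tokens):
    """Multi-value encoding: 1 for every token present."""
    present = set(tokens)
    return [1 if v in present else 0 for v in VOCAB]

def count_vector(tokens):
    """Like multi-hot, but keeps the number of occurrences."""
    counts = Counter(tokens)
    return [counts[v] for v in VOCAB]

def label_encode(values):
    """Map each distinct value to an integer ID (sorted order)."""
    mapping = {v: i for i, v in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values]

print(one_hot("A"))
print(multi_hot("AB"))
print(count_vector("AAB"))
print(label_encode(["cat", "dog", "cat"]))
```

Multi-hot is the element-wise sum of the one-hot vectors when each token occurs at most once; `count_vector` is what that sum becomes when repeats are allowed.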