7. Introduction to Field-aware Factorization Machines (FFM)
2022-06-13 12:10:00 【nsq1101】
Preface
FFM, short for Field-aware Factorization Machines, is an improved version of FM (Factorization Machines).
Source:
The concept was originally proposed by Yu-Chin Juan (Ruan Yuqin, a graduate of National Taiwan University, now working at Criteo in the United States) and his competition teammates. Borrowing the notion of a field from a paper by Michael Jahrer, they developed an upgraded version of the FM model. By introducing fields, FFM groups features of the same nature into the same field.
1、 FFM principle
In CTR estimation we often encounter one-hot encoded variables, which make the feature data sparse. To address this, FFM builds on FM by introducing the notion of a category, the field, into the model. Features produced by one-hot encoding the same variable are placed in the same field. In FFM, each feature therefore learns a separate latent vector for every field of the other features, so a latent vector depends not only on the feature but also on the field.
1.1 Introducing field
Take a feed recommendation scenario as an example. Suppose we add more user-dimension information such as the user's age: gender and age are both user-dimension features, while tag is an item-dimension feature. As explained for FM, the latent effect of "male" paired with "basketball" is by default treated the same as that of "male" paired with "age", but in practice this is not necessarily so. FM cannot capture this difference because it has no notion of the broader category, the field, and always uses the same latent vector of a feature in every dot product.
In FFM (Field-aware Factorization Machines), every feature belongs to one particular field, and field to feature is a one-to-many relationship, as shown in the following table:

1.2 Combination features
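For reference, the FFM model is often written as below. The notation is an assumption chosen for this note and is not taken from the original post: n is the number of features, x_i the value of feature i, f_j the field that feature j belongs to, and v_{i,f_j} the latent vector that feature i uses when it interacts with a feature from field f_j.

```latex
\hat{y}(\mathbf{x}) = w_0 + \sum_{i=1}^{n} w_i x_i
  + \sum_{i=1}^{n} \sum_{j=i+1}^{n}
    \langle \mathbf{v}_{i,f_j}, \mathbf{v}_{j,f_i} \rangle \, x_i x_j
```

In plain FM the interaction weight is \langle \mathbf{v}_i, \mathbf{v}_j \rangle, with a single latent vector per feature; FFM replaces it with a field-specific pair of vectors.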

2、 Practical examples
On macOS:
- pip install cmake
- Accept any installation prompts that appear
- pip install xlearn
- Once installation succeeds, run the following example
import xlearn as xl

# Training
ffm_model = xl.create_ffm()                   # Use the FFM model
ffm_model.setTrain(r"./small_train.txt")      # Training data
ffm_model.setValidate(r"./small_test.txt")    # Validation data

# Hyperparameters:
#  0. task: binary classification
#  1. learning rate: 0.2
#  2. regularization lambda: 0.002
#  3. evaluation metric: accuracy
param = {'task': 'binary', 'lr': 0.2,
         'lambda': 0.002, 'metric': 'acc'}

# Start training; the trained model is written to ./model.out
ffm_model.fit(param, './model.out')

# Prediction
ffm_model.setTest(r"./small_test.txt")        # Test data
ffm_model.setSigmoid()                        # Map outputs to (0, 1) with a sigmoid

# Start predicting; results are written to ./output.txt
ffm_model.predict("./model.out", "./output.txt")
xLearn user manual:
https://xlearn-doc-cn.readthedocs.io/en/latest/python_api/index.html#id4
3、 FFM application
3.1 Scenario introduction
In DSP or recommendation scenarios, FFM is mainly used to estimate CTR and CVR, that is, a user's potential click-through rate and conversion rate for a product.
The CTR and CVR prediction models are trained offline and then used for online prediction. The two models use similar features, which fall into three main categories:
- User-related features
Basic information such as age, gender, occupation, interests, category preferences, and browsed/purchased categories, plus statistics such as recent click count, purchase count, and spending amount
- Product-related features
Information such as product category, sales volume, price, rating, and historical CTR/CVR
- User-product matching features
Browsed/purchased category match, browsed/purchased merchant match, interest preference match, etc.
To apply the FFM model, every feature must be converted into the "field_id:feat_id:value" format, where field_id is the number of the field the feature belongs to, feat_id is the feature number, and value is the feature value.
- Numerical features are the easiest to handle: simply assign each one its own field number, e.g. user rating or a product's historical CTR/CVR.
- Categorical features must be converted to numeric form with one-hot encoding. All features generated by encoding one variable belong to the same field, and their value can only be 0 or 1, e.g. the user's gender, age group, or the product's category.
- There is also a third kind of feature, such as the categories a user has browsed/purchased: there are many category ids, and a number measures how many products the user browsed or bought in each category. These are handled like categorical features, except that the value is no longer 0 or 1 but the browse or purchase count.
- After field_id has been assigned as described above, number the transformed features sequentially to obtain feat_id; the feature value is obtained as described above.
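The conversion rules above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the original post: the class name FFMEncoder, the feature names, and the choice to put the label first on each line (the layout libffm-style text files normally use) are all assumptions made for the example.

```python
class FFMEncoder:
    """Hypothetical helper: assigns field ids and feature ids sequentially."""

    def __init__(self):
        self.field_ids = {}   # field name -> field_id
        self.feat_ids = {}    # (field name, category) -> feat_id

    def _field(self, name):
        return self.field_ids.setdefault(name, len(self.field_ids))

    def _feat(self, key):
        return self.feat_ids.setdefault(key, len(self.feat_ids))

    def encode(self, label, numeric=None, categorical=None, counts=None):
        parts = [str(label)]
        # Numerical features: one field and one feature each, real value.
        for name, value in (numeric or {}).items():
            field, feat = self._field(name), self._feat((name,))
            if value != 0:
                parts.append(f"{field}:{feat}:{value:g}")
        # Categorical features: one field per variable, one feature per
        # category, value fixed at 1 (one-hot).
        for name, category in (categorical or {}).items():
            field, feat = self._field(name), self._feat((name, category))
            parts.append(f"{field}:{feat}:1")
        # Count features: like categorical, but the value is the count.
        for name, per_category in (counts or {}).items():
            field = self._field(name)
            for category, count in per_category.items():
                feat = self._feat((name, category))
                if count != 0:   # zero-valued features are simply left out
                    parts.append(f"{field}:{feat}:{count:g}")
        return " ".join(parts)


encoder = FFMEncoder()
line = encoder.encode(
    label=1,
    numeric={"item_ctr": 0.05},
    categorical={"gender": "male"},
    counts={"browsed_category": {"sports": 3, "books": 0}},
)
print(line)  # 1 0:0:0.05 1:1:1 2:2:3
```

The same encoder instance has to be reused for every sample so that ids stay consistent between training and prediction data.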
The sample labels for CTR and CVR estimation are obtained differently. For CTR estimation, positive samples are user-product records that were clicked on the site, and negative samples are records that were displayed but not clicked. For CVR estimation, positive samples are user-product records that led to an on-site payment (conversion), and negative samples are records that were clicked but not paid for.
3.2 Minutiae
When training FFM, a few small details deserve special attention.
First, sample normalization.
FFM normalizes sample data by default, i.e. pa.norm is true. If this parameter is set to false, the data easily overflows to inf, which in turn leads to nan errors in the gradient computation. Sample-level normalization is therefore recommended.
Second, feature normalization.
CTR/CVR models use features from many sources, including numerical and categorical ones. However, an encoded categorical feature takes only the value 0 or 1, so a large numerical feature makes the features generated from categorical variables very small after sample normalization, leaving them with no discriminative power. For example, take a user-product record where the user is male and the product's sales volume is 5000 (assume all other features are zero). After normalization, the feature "sex=male" is slightly less than 0.0002, while "volume" (sales) is approximately 1. The feature "sex=male" is then almost negligible in this sample, which is quite unreasonable. It is therefore necessary to normalize the source numerical features to [0,1].
Third, omit zero-valued features.
The FFM model expression shows that zero-valued features contribute nothing to the model: both first-order terms and interaction terms that contain a zero-valued feature are zero, so they have no effect on parameter estimation during training or on the predicted value. Zero-valued features can therefore be omitted to speed up FFM training and prediction, which is also a clear advantage of using FFM on sparse samples.
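The normalization example with the male user and a sales volume of 5000 can be checked numerically. The sketch below assumes that sample normalization means scaling each sample to unit L2 norm, which matches the figures quoted in the example. The function name l2_normalize and the maximum sales volume of 10000 used for min-max scaling are hypothetical values chosen for illustration.

```python
import math


def l2_normalize(features):
    """Scale one sample (feature name -> value) to unit L2 norm."""
    norm = math.sqrt(sum(v * v for v in features.values()))
    if norm == 0:
        return dict(features)
    return {name: v / norm for name, v in features.items()}


# Raw sample: a one-hot gender feature next to an unscaled sales volume.
raw = {"sex=male": 1.0, "volume": 5000.0}
unscaled = l2_normalize(raw)
# Here "sex=male" ends up just below 0.0002 and "volume" is close to 1.

# Scale the numerical feature to [0, 1] first (assumed max volume: 10000),
# then apply the same sample normalization.
max_volume = 10000.0
scaled = l2_normalize({"sex=male": 1.0, "volume": 5000.0 / max_volume})
# Now both features keep a meaningful share of the sample norm.

print(unscaled)
print(scaled)
```

With the numerical feature scaled first, the one-hot feature is no longer drowned out by the raw sales figure.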
4、FFM vs FM
- FM is a special case of FFM
- FFM builds on FM by introducing the concept of a field. In FM each feature has only one latent vector, whereas in FFM each feature has several, and the one used in a dot product is selected according to the field of the other feature
- In terms of computational complexity, the FM computation can be reduced to O(kn), whereas FFM is O(kn^2)
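The comparison above can be made concrete with a small NumPy sketch of the two second-order terms. It is an illustration under assumed shapes, not a production implementation: V holds one latent vector per feature for FM (shape n x k), W holds one latent vector per feature and field for FFM (shape n x f x k), and field_of maps each feature index to its field index. All names and the toy data are hypothetical.

```python
import numpy as np


def fm_pairwise(x, V):
    """FM second-order term in O(k*n), using the sum-of-squares identity."""
    xv = V * x[:, None]                      # shape (n, k)
    return 0.5 * float(np.sum(xv.sum(axis=0) ** 2 - (xv ** 2).sum(axis=0)))


def ffm_pairwise(x, W, field_of):
    """FFM second-order term in O(k*n^2): a loop over all feature pairs."""
    n = len(x)
    total = 0.0
    for i in range(n):
        if x[i] == 0:                        # zero-valued features add nothing
            continue
        for j in range(i + 1, n):
            if x[j] == 0:
                continue
            # Feature i uses its vector for j's field, and vice versa.
            total += float(W[i, field_of[j]] @ W[j, field_of[i]]) * x[i] * x[j]
    return total


rng = np.random.default_rng(0)
n, f, k = 6, 3, 4
x = np.array([1.0, 0.0, 1.0, 0.0, 0.5, 0.0])
field_of = np.array([0, 0, 1, 1, 2, 2])

V = rng.normal(size=(n, k))                  # FM: n*k parameters
W = rng.normal(size=(n, f, k))               # FFM: n*f*k parameters

fm_value = fm_pairwise(x, V)
ffm_value = ffm_pairwise(x, W, field_of)

# With a single field, FFM collapses to FM.
single_field = np.zeros(n, dtype=int)
ffm_single_value = ffm_pairwise(x, V[:, None, :], single_field)

print(V.size, W.size)                        # parameter counts n*k and n*f*k
print(np.isclose(fm_value, ffm_single_value))
```

The FM shortcut works because every feature has one vector, so the pairwise sum factorizes; in FFM the vector depends on the partner's field, so that factorization is no longer available and the pair loop remains.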
FFM advantages and disadvantages
- FFM advantage:
By adding the concept of a field, the same feature uses different latent vectors for different fields, so the modeling is more accurate
- FFM disadvantage:
The computational complexity is relatively high: the number of parameters is nfk, and the computational complexity is O(kn^2)
5、 Summary
- In theory, FFM's parameter factorization has some significant advantages: it is especially suited to sparse samples while still delivering good performance;
- In practice, using FFM for on-site CTR/CVR estimation is a reasonable choice, and all the metrics show that FFM performs well in click-through estimation.