ESMM reading notes
2022-06-29 00:51:00 【Staring foreshadowing】
Paper: "Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate", Alibaba, 2018
1. Motivation
Unlike CTR estimation, CVR estimation faces two key problems:
Sample Selection Bias (SSB): A conversion is something that may happen only after a click. A conventional CVR model is trained on click data: a click without a conversion is a negative example, and a click followed by a conversion is a positive example. At prediction time, however, the model has to score the entire exposure space, not just the clicked samples. In other words, the training data and the data actually being predicted come from different distributions: the training data is the set of clicked samples, while the prediction data is the full set of exposed samples. Using a limited subset to predict over the whole sample space introduces a large bias; it is like viewing a leopard through a tube and seeing a single spot.

Figure 1. Illustration of SSB
Data Sparsity (DS): The clicked samples used to train a CVR model are far fewer than the exposure samples used to train a CTR model. This sparsity makes the CVR model hard to fit.
Some strategies can alleviate these two problems, for example sampling unclicked exposures as negative examples to mitigate SSB, or oversampling converted samples to mitigate DS. But none of them solves either problem elegantly and effectively.
Note that click -> conversion is a pair of strongly related sequential behaviors. The authors want the model to express this "behavior chain" explicitly, so that both training and prediction can be carried out over the entire space. This involves two tasks, CTR and CVR, so multi-task learning (MTL) is a natural choice; the key highlight of the paper is how this MTL model is built.
2. Model
ESMM tackles the SSB and DS problems from the modeling side.
First, three concepts need to be made clear:
CTR: the probability that an item is clicked, given that it is exposed.
CVR: the probability that an item is converted, given that it has been clicked. Note the condition "given that it has been clicked".
CTCVR: the probability that an item is both clicked and converted, given that it is exposed.
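To make the three definitions concrete, here is a minimal sketch that computes the empirical rates from a toy exposure log. The log contents and the helper name `rates` are made up for illustration and do not come from the paper.

```python
# Toy exposure log: one (clicked, converted) pair per exposed item.
# The values are invented purely for illustration.
impressions = [
    (1, 1), (1, 0), (1, 0), (1, 0),                  # 4 clicked, 1 of them converted
    (0, 0), (0, 0), (0, 0), (0, 0), (0, 0), (0, 0),  # 6 exposed but not clicked
]


def rates(log):
    """Return empirical (CTR, CVR, CTCVR) for a list of (click, conversion) pairs."""
    n_impressions = len(log)
    n_clicks = sum(click for click, _ in log)
    n_click_and_convert = sum(click * conv for click, conv in log)
    ctr = n_clicks / n_impressions                # clicks over exposures
    cvr = n_click_and_convert / n_clicks          # conversions over clicks only
    ctcvr = n_click_and_convert / n_impressions   # click-and-convert over exposures
    return ctr, cvr, ctcvr


ctr, cvr, ctcvr = rates(impressions)
print(f"CTR={ctr:.2f}  CVR={cvr:.2f}  CTCVR={ctcvr:.2f}")
```

Only CVR has the number of clicks in its denominator; CTR and CTCVR are both defined over all exposures, which is why they can be learned on the entire space.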
A CVR model cannot be trained directly on all samples, for the following reason:
For unclicked items, we do not know whether they would have been converted had the user clicked them. Using 0 directly as their label would seriously mislead the CVR model.
Because this counterfactual information (what would have happened if an unclicked item had been clicked) is unavailable, there is nothing to give the CVR model to fit on those samples.

pCTCVR = pCTR * pCVR, i.e. p(y = 1, z = 1 | x) = p(z = 1 | x) * p(y = 1 | z = 1, x)
where y and z denote conversion and click respectively. Note that over the entire sample space, the label for CTR is click, and the label for CTCVR is click & conversion.
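The label construction described above can be sketched as follows. This is only an illustration under the assumption that each exposure is logged as a `(click, conversion)` pair of 0/1 integers; the function names are hypothetical.

```python
def build_labels(log):
    """Build entire-space labels from (click, conversion) pairs.

    Every exposure yields one training example with two labels:
    the CTR label is click, the CTCVR label is click & conversion.
    """
    return [(click, click & conv) for click, conv in log]


def clicked_only_cvr_labels(log):
    """Conventional CVR training set: only clicked exposures are kept."""
    return [conv for click, conv in log if click == 1]


log = [(1, 1), (1, 0), (0, 0), (0, 0), (0, 0)]
print(build_labels(log))             # one label pair per exposure
print(clicked_only_cvr_labels(log))  # only the clicked rows survive
```

Unclicked exposures never need a CVR label here: they contribute a 0 to both the CTR and the CTCVR targets, which is a fact that can actually be observed.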

Figure 2. ESMM network structure
The CVR model is built over the entire sample space, so all the data can be used to fit it; the lower layers of the CTR and CVR networks are
Look closely at the figure above and note the following points:
1) Shared embedding: the CVR task and the CTR task use the same features and the same feature embeddings; the two tasks only start to learn their own task-specific parameters after the Concatenate layer.
2) Implicit learning of pCVR: what does this mean? Here pCVR (the pink node) is just a variable inside the network, with no explicit supervision signal of its own.
The network is trained with the CTCVR and CTR supervision signals, and CVR is learned implicitly. This is the essence of ESMM. As for the necessity and rationality of doing so
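The structure described above can be sketched in a few lines of plain Python. This is a toy forward pass, not the authors' implementation: the table sizes, the single-layer "towers", and every name below are assumptions chosen for brevity, and the loss is written as the sum of two cross-entropy terms on the CTR and CTCVR outputs.

```python
import math
import random

random.seed(0)

EMB_DIM = 4          # assumed embedding size
N_FEATURE_IDS = 10   # assumed vocabulary size
N_FIELDS = 2         # assumed number of feature fields per sample


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


# Shared embedding table: both towers read from the same vectors.
embedding = [[random.gauss(0, 0.1) for _ in range(EMB_DIM)]
             for _ in range(N_FEATURE_IDS)]


def make_tower(in_dim):
    """A deliberately tiny tower: one linear layer followed by a sigmoid."""
    return {"w": [random.gauss(0, 0.1) for _ in range(in_dim)], "b": 0.0}


def tower_forward(tower, x):
    return sigmoid(sum(w * v for w, v in zip(tower["w"], x)) + tower["b"])


ctr_tower = make_tower(N_FIELDS * EMB_DIM)
cvr_tower = make_tower(N_FIELDS * EMB_DIM)


def esmm_forward(feature_ids):
    # Look up the shared embeddings and concatenate them.
    x = [v for fid in feature_ids for v in embedding[fid]]
    p_ctr = tower_forward(ctr_tower, x)
    p_cvr = tower_forward(cvr_tower, x)  # intermediate variable, no label of its own
    p_ctcvr = p_ctr * p_cvr              # the multiplicative relationship
    return p_ctr, p_cvr, p_ctcvr


def bce(p, label, eps=1e-12):
    """Binary cross-entropy for a single prediction."""
    return -(label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps))


def esmm_loss(feature_ids, click, conversion):
    # Supervision is applied only to pCTR and pCTCVR, both defined on all exposures.
    p_ctr, _, p_ctcvr = esmm_forward(feature_ids)
    return bce(p_ctr, click) + bce(p_ctcvr, click & conversion)


print(esmm_forward([1, 2]))
print(esmm_loss([1, 2], click=1, conversion=0))
```

Gradients from the CTCVR term reach the CVR tower through the product, which is how pCVR gets trained without ever receiving a label directly.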
Thinking further: ESMM's structure is designed around a multiplicative relationship, pCTCVR = pCVR * pCTR. Could pCVR be obtained through division instead, i.e. pCVR = pCTCVR / pCTR? For example, train a CTCVR model and a CTR model separately, then divide their outputs to get pCVR. This does work, but it has an obvious drawback: in real scenarios the predicted pCTR and pCTCVR are both very small, so the division easily becomes numerically unstable. The authors compare against this method in their experiments.
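A small numeric illustration of why the division is fragile. The probabilities and error sizes below are invented for the example; they are not figures from the paper.

```python
# Hypothetical "true" probabilities for one exposure.
true_ctr = 0.02
true_cvr = 0.05
true_ctcvr = true_ctr * true_cvr  # about 0.001

# Suppose two separately trained models each make a small absolute error.
est_ctr = true_ctr - 0.015        # CTR model underestimates slightly
est_ctcvr = true_ctcvr + 0.005    # CTCVR model overestimates slightly

# Division approach: the small errors are amplified by the tiny denominator,
# and the result is not even a valid probability.
cvr_by_division = est_ctcvr / est_ctr
print("division:", cvr_by_division)

# Multiplicative approach: pCVR comes out of a sigmoid, so it stays in (0, 1),
# and the product pCTR * pCVR can never exceed pCTR.
est_cvr = true_cvr + 0.02         # a comparable absolute error on pCVR
ctcvr_by_product = est_ctr * est_cvr
print("product:", ctcvr_by_product)
```

In this example errors of at most 0.015 on the inputs push the divided estimate above 1, while the multiplicative form keeps every quantity inside [0, 1] by construction.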