[Advertising system] Incremental training & feature admission / feature elimination
2022-07-05 10:57:00 【CC‘s World】
1. Incremental training
Training datasets can be very large; tens of millions of records are common. Tens of millions of rows may not sound like much by itself, but with hundreds of features per row the dataset becomes enormous, and storing it as numpy.float values will certainly blow up memory. This is the situation I ran into, and it is why I started to consider training the model incrementally.
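To make the memory claim concrete, here is a rough back-of-the-envelope sketch. The row and feature counts are assumptions chosen only for illustration (the article gives no exact figures), and `dense_matrix_bytes` is a hypothetical helper name.

```python
def dense_matrix_bytes(n_rows: int, n_features: int, bytes_per_value: int = 8) -> int:
    """Bytes needed to hold a dense n_rows x n_features matrix in memory."""
    return n_rows * n_features * bytes_per_value


# Assumed sizes, for illustration only: 10 million rows, 300 features.
n_rows, n_features = 10_000_000, 300

as_float64 = dense_matrix_bytes(n_rows, n_features, 8)  # 8 bytes per float64 value
as_float32 = dense_matrix_bytes(n_rows, n_features, 4)  # 4 bytes per float32 value

print(f"float64: {as_float64 / 1e9:.1f} GB")
print(f"float32: {as_float32 / 1e9:.1f} GB")
```

Under these assumed sizes the dense matrix alone needs tens of gigabytes, before counting any copies made during preprocessing or training.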
For very large datasets there are usually several options: 1. reduce the dimensionality of the data; 2. train incrementally, using streaming or stream-like processing; 3. use a bigger machine with more memory, or a Spark cluster.
Incremental training means essentially the same thing as online learning. The typical example of online learning is logistic regression optimized with SGD: the parameters are first initialized from existing data, then updated with each new sample that arrives online, so the model keeps improving as time passes. This avoids the problem of having to update the model offline.
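The idea can be sketched as a tiny online logistic regression updated one sample at a time with SGD. This is a minimal illustration, not the article's production setup: the class name `OnlineLogisticRegression`, the learning rate, and the synthetic `make_stream` data generator are all assumptions.

```python
import math
import random


class OnlineLogisticRegression:
    """Logistic regression trained one sample at a time with plain SGD."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, x):
        z = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        # Gradient of the log loss with respect to the logit is (p - y).
        err = self.predict_proba(x) - y
        for i, xi in enumerate(x):
            self.w[i] -= self.lr * err * xi
        self.b -= self.lr * err


def make_stream(n, seed):
    """Hypothetical data source: label is 1 when x0 + x1 > 0."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        yield x, (1 if x[0] + x[1] > 0 else 0)


def accuracy(model, samples):
    hits = sum((model.predict_proba(x) >= 0.5) == (y == 1) for x, y in samples)
    return hits / len(samples)


model = OnlineLogisticRegression(n_features=2)
for x, y in make_stream(2000, seed=0):  # initialize from existing data
    model.update(x, y)
for x, y in make_stream(2000, seed=1):  # keep updating as new samples arrive
    model.update(x, y)

holdout = list(make_stream(500, seed=2))
print(f"holdout accuracy: {accuracy(model, holdout):.3f}")
```

The same `update` call is used for the initial pass and for the later stream, which is the point: the model never needs the full dataset in memory at once.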
Incremental training serves two main purposes: one is to make it possible to use all of the data, and the other is to make timely use of new data. It improves model freshness, increases the usable sample size, and saves cluster resources.
Recommendation scenarios usually have a large number of sparse parameters because of the many ID-type features they introduce. For example, the classic YouTube DNN model uses the videos a user has watched and the user's historical search tokens as its main embedding features. According to the discussion in the paper, the candidate videos and the search tokens in YouTube DNN each number in the millions. If cross features are used on top of this, the parameter explosion gets even worse.
Low-frequency ID-type features in recommendation scenarios also bring a risk of overfitting to the system. To address this, we designed a feature admission / elimination mechanism, which makes it easy to adjust the influence of low-frequency sparse parameters on the model according to the expressive power intended for the specific model.
2. Feature admission
In a business scenario, new samples are produced all the time, and new samples bring new features. Some features appear only rarely. Adding all of them to the model is a challenge for memory on the one hand, and on the other hand low-frequency features lead to overfitting. So feature admission mechanisms are put in place, including probability-based filtering, Bloom filters, and so on.
The training framework sets an admission "threshold" for new features to keep low-frequency features from being admitted too often. We provide two mechanisms to limit the admission of new features:
- Probabilistic admission: each time a new feature is encountered, a probability is generated from a preset distribution and used to decide whether the feature is admitted;
- Counting Bloom Filter: a Counting Bloom Filter (CBF) counts the occurrences of each new feature, and the feature is admitted once the count exceeds the threshold.
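The first mechanism, probabilistic admission, can be sketched roughly as follows. The article only says the probability comes from "a preset distribution"; this sketch assumes the simplest case, a fixed Bernoulli probability, and the class name `ProbabilisticAdmission` and its parameters are hypothetical.

```python
import random


class ProbabilisticAdmission:
    """Admit each not-yet-admitted feature with a fixed probability per occurrence.

    Assumption: a constant Bernoulli probability stands in for the
    "preset distribution" mentioned in the text.
    """

    def __init__(self, admit_prob, seed=None):
        if not 0.0 <= admit_prob <= 1.0:
            raise ValueError("admit_prob must be in [0, 1]")
        self.admit_prob = admit_prob
        self.admitted = set()
        self._rng = random.Random(seed)

    def observe(self, feature_id):
        """Record one occurrence; return True if the feature is (now) admitted."""
        if feature_id in self.admitted:
            return True  # once admitted, a feature stays admitted
        if self._rng.random() < self.admit_prob:
            self.admitted.add(feature_id)
            return True
        return False


gate = ProbabilisticAdmission(admit_prob=0.1, seed=42)
stream = ["user_1"] * 50 + ["user_2"]  # user_1 is frequent, user_2 appears once
for fid in stream:
    gate.observe(fid)
print(sorted(gate.admitted))
```

With a per-occurrence probability p, a feature needs about 1/p occurrences on average before it is admitted, so frequent features get in quickly while one-off features are mostly filtered out, without storing any per-feature counter.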
The figure above briefly illustrates how a CBF works. Suppose the capacity is 16 and two hash functions map a feature ID to slot indexes. When querying a feature's frequency, Feature1 is mapped by Hash Function 1 and Hash Function 2 to Slot 3 and Slot 6 respectively; both slots hold the value 1, so Feature1 is taken to have occurred once. Feature2 is mapped by the two hash functions to Slot 6 and Slot 15, which hold 1 and 0 respectively, so Feature2 is taken to have occurred 0 times. In other words, the estimated count is the minimum value over all slots the feature maps to.
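A minimal Counting Bloom Filter along these lines might look like the sketch below. The hashing scheme (salted `hashlib.blake2b`), the class and function names, and the strict "count > threshold" admission rule are assumptions for illustration; the article does not specify which hash functions the framework uses.

```python
import hashlib


class CountingBloomFilter:
    """Approximate per-feature counter: increment k slots, read the minimum."""

    def __init__(self, capacity=16, num_hashes=2):
        self.capacity = capacity
        self.num_hashes = num_hashes
        self.slots = [0] * capacity

    def _indexes(self, feature_id):
        # Derive num_hashes slot indexes by salting one hash function (assumed scheme).
        for i in range(self.num_hashes):
            data = f"{i}:{feature_id}".encode("utf-8")
            digest = hashlib.blake2b(data, digest_size=8).digest()
            yield int.from_bytes(digest, "big") % self.capacity

    def add(self, feature_id):
        # set() avoids incrementing one slot twice when both hashes collide.
        for idx in set(self._indexes(feature_id)):
            self.slots[idx] += 1

    def count(self, feature_id):
        # The estimate is the minimum over all mapped slots; it can
        # overestimate (hash collisions) but never underestimate.
        return min(self.slots[idx] for idx in self._indexes(feature_id))


def admit(cbf, feature_id, threshold):
    """Record one occurrence and admit once the count exceeds the threshold."""
    cbf.add(feature_id)
    return cbf.count(feature_id) > threshold


# The worked example from the text, with slot values set by hand:
slots = [0] * 16
slots[3] = slots[6] = 1              # Feature1 was inserted once
feature1_count = min(slots[3], slots[6])   # Feature1 maps to slots 3 and 6
feature2_count = min(slots[6], slots[15])  # Feature2 maps to slots 6 and 15
print(feature1_count, feature2_count)

cbf = CountingBloomFilter(capacity=1024)
decisions = [admit(cbf, "feature_1", threshold=2) for _ in range(3)]
print(decisions)
```

Taking the minimum is what keeps Feature2 at 0 in the example: Slot 6 was incremented by Feature1, but Slot 15 was never touched, so Feature2 cannot have been seen.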
3. Feature elimination
Some features become stale if they are not updated for a long time. To relieve memory pressure and keep the model fresh, obsolete features need to be eliminated according to defined elimination rules.
For features that have already been admitted, there are three ways to judge whether a feature is in a low-frequency state:
- Update time. If a feature has not been updated for a long time, it is considered to be in a low-frequency state;
- L2 norm. If the L2 norm of a feature is too small, it is considered to be in a low-frequency state;
- Composite score of statistics. User-defined functions are supported: a composite score is computed from the feature's statistics (impressions, clicks, likes, comments, etc.), and if the score is below the threshold the feature is considered to be in a low-frequency state.
Features judged to be in a low-frequency state are eliminated and masked; the next time such a feature reappears, it is treated as a new feature.
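The three rules can be sketched as a single eviction pass over an embedding table. Everything here is illustrative: `FeatureState`, the parameter names, and the choice to evict when any one rule fires are assumptions, since the article does not say how the rules are combined.

```python
import math
from dataclasses import dataclass, field


@dataclass
class FeatureState:
    embedding: list        # parameters of this feature
    last_update: float     # timestamp of the last update, in seconds
    stats: dict = field(default_factory=dict)  # e.g. impressions, clicks


def l2_norm(vec):
    return math.sqrt(sum(v * v for v in vec))


def is_low_frequency(state, now, max_idle_seconds, min_l2_norm,
                     score_fn=None, min_score=0.0):
    """Return True if any of the three rules marks the feature as low frequency."""
    if now - state.last_update > max_idle_seconds:    # rule 1: update time
        return True
    if l2_norm(state.embedding) < min_l2_norm:        # rule 2: L2 norm
        return True
    if score_fn is not None and score_fn(state.stats) < min_score:  # rule 3
        return True
    return False


def evict(table, now, max_idle_seconds, min_l2_norm, score_fn=None, min_score=0.0):
    """Delete low-frequency features from the table and return their IDs."""
    stale = [fid for fid, st in table.items()
             if is_low_frequency(st, now, max_idle_seconds, min_l2_norm,
                                 score_fn, min_score)]
    for fid in stale:
        del table[fid]  # if the ID reappears later, it goes through admission again
    return stale


now = 1_000_000.0
table = {
    "fresh": FeatureState([0.5, 0.5], now - 10, {"clicks": 50}),
    "idle": FeatureState([0.5, 0.5], now - 10_000, {"clicks": 50}),
    "tiny": FeatureState([1e-6, 0.0], now - 10, {"clicks": 50}),
    "cold": FeatureState([0.5, 0.5], now - 10, {"clicks": 0}),
}
evicted = evict(table, now, max_idle_seconds=3600, min_l2_norm=1e-3,
                score_fn=lambda s: s.get("clicks", 0), min_score=1)
print(evicted, list(table))
```

Here `score_fn` plays the role of the user-defined scoring function; a real one would likely weight several statistics rather than read a single click count.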
With feature admission & elimination enabled, a recommendation model can generally be shrunk to a quarter of its size without them, while online prediction AUC stays flat at the thousandths place.