
Multi-Interest Recall Model in Practice | Dewu Tech


Written by QC | Dewu Tech

This article records how we took MIND multi-interest recall from idea to launch in two stages, offline and then real-time, along with the intermediate steps; we hope you take something away from reading it.

Concretely: we first validated feasibility with an offline recall experiment, where the day-level metrics improved with confidence: pvctr +0.36%, uvctr +0.36%, dpv +0.76%. On top of that version we then built the online MIND inference path. The final gains in the trading waterfall-feed scenario were: dpv +3.61%, per-capita favorites pv +2.39%, pvctr +1.95%, uvctr +0.77%.

In the offline recall stage, the recall works much like I2I: first train the model, then batch-compute the user embeddings and item embeddings on a single machine, then use faiss to build an inverted list of recalled items for each userId, and finally use the userId as the trigger for the I2I-style recall. For online inference, real-time user behavior is passed to the neuron estimation service, whose model computes the user embedding; it is then matched via faiss against the item embeddings computed offline by the C++ engine to fetch the recalled items. The number of interests is currently set to 3, and given the recall component's timeout budget the per-interest lookups have to run concurrently.
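As a reference, here is a minimal sketch of the offline flow with faiss (the file names, the top-N, and the normalization choice are illustrative assumptions, not our production code):

import numpy as np
import faiss

d = 32                                                # embedding dimension used by the model
item_emb = np.load("item_emb.npy").astype("float32")  # (num_items, d), dumped after training
user_emb = np.load("user_emb.npy").astype("float32")  # (num_users * K, d), K interests per user

# build an inner-product index over the item embeddings
faiss.normalize_L2(item_emb)
index = faiss.IndexFlatIP(d)
index.add(item_emb)

# retrieve top-100 items per (user, interest) row; offline these lists are
# stored as inverted lists keyed by userId and used as the I2I trigger
faiss.normalize_L2(user_emb)
scores, item_ids = index.search(user_emb, 100)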

1. Introduction to multi-interest recall

1.1 MIND multi-interest recall


MIND multi-interest recall is a kind of u2i recall. It proposes Behavior-to-Interest (B2I) dynamic routing, which adaptively aggregates a user's behaviors into interest representation vectors. Specifically, the input of the multi-interest model is the user's behavior sequence; here it consists of the cspu_ids the user clicked, favorited, or shared in the waterfall-feed scenario. The output is multiple interest vectors for the user (this size is configurable, determined by the network's output dimension). Generally speaking, the interest vectors differ mainly in category/brand.
Multi-interest user vectors are a highlight of this recall. A user browsing the waterfall feed obviously does not have a single interest; they tend to focus on a few different ones, perhaps shoes, clothing, cosmetics. A single interest vector therefore often struggles to cover everything the user cares about.
Recall results from a single interest are often confined to one category. That recall strategy scales poorly and easily narrows the feed over time, making the head effect in the candidate pool more and more pronounced.

 

1.2 Other multi-interest recall methods:

(1) Use both the user's long-term and short-term interests, recalling with every clicked item as a trigger; this captures all of the user's past interests and is also one of the i2i strategies currently online;

(2) Alternatively, take all clicked/favorited/shared items and run an item2vec-style recall: obtain each item's embedding, build a faiss index, and fetch the top-N to cover all of the user's interest points (a sketch follows this list).
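As a sketch of option (2), item2vec can be run with gensim by treating each user's behavior sequence as a "sentence" (the file name and hyperparameters here are assumptions):

from gensim.models import Word2Vec

# each line: space-separated cspu_ids from one user's click/favorite/share sequence
sequences = [line.split() for line in open("behavior_seqs.txt")]

model = Word2Vec(sequences, vector_size=32, window=5, min_count=3, sg=1, epochs=5)
# model.wv["<some cspu_id>"] is that item's embedding; index all of them with
# faiss and query with a user's recent items to take the top-N per interest point.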

 

2. The multi-interest recall model

The figure below shows the overall network diagram of MIND recall:

 

For the model, each training sample can be represented as a tuple $(\mathcal{I}_u, \mathcal{F}_i)$, where $\mathcal{I}_u$ denotes the set of items user $u$ has interacted with, i.e., the user's historical behavior sequence, and $\mathcal{F}_i$ denotes features of the target item $i$, such as its brand brand_id and its category category_id.

 

  • Through the Multi-Interest Extractor Layer, obtain multiple vectors that express different aspects of the user's interests;
  • Propose a multi-interest network with dynamic routing (MIND), where dynamic routing adaptively aggregates the user's historical behaviors into user representation vectors, handling the user's different interests;
  • Develop Label-Aware Attention, a label-aware attention mechanism, for learning user representations from the multiple interest vectors.

 

Specifically, a multi-interest extractor layer based on the capsule-network mechanism is designed to cluster historical behaviors and extract distinct interests. The core idea of capsule networks is that "the output is the result of some kind of clustering of the input."

 

There is an advantage here: if all the information about a user's interests is compressed into a single representation vector, that vector becomes a bottleneck for expressing the user's diverse interests, and when recalling the candidate set in the recall stage, information about the user's different interests gets blended together, lowering the relevance of the recalled items. Therefore multiple vectors are used to express the user's different interests: the user's historical behaviors are grouped into multiple interest capsules, with the expectation that the correlated items belonging to the same capsule jointly express one specific aspect of the user's interest.
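Below is a numpy sketch of one plausible reading of B2I dynamic routing (shared bilinear mapping S and randomly initialized routing logits, per the MIND paper 【2】; the shapes and iteration count are illustrative):

import numpy as np

def squash(z, eps=1e-9):
    # capsule nonlinearity: short vectors shrink toward 0, long ones toward unit norm
    n2 = np.sum(z * z, axis=-1, keepdims=True)
    return (n2 / (1.0 + n2)) * z / np.sqrt(n2 + eps)

def b2i_dynamic_routing(behav, K=3, iters=3, seed=0):
    # behav: (n, d) behavior embeddings -> returns (K, d) interest capsules
    rng = np.random.default_rng(seed)
    n, d = behav.shape
    S = rng.normal(scale=0.1, size=(d, d))     # shared bilinear mapping (B2I)
    logits = rng.normal(size=(n, K))           # routing logits, random init
    mapped = behav @ S                         # (n, d)
    for _ in range(iters):
        w = np.exp(logits - logits.max(axis=1, keepdims=True))
        w = w / w.sum(axis=1, keepdims=True)   # softmax over capsules per behavior
        caps = squash(w.T @ mapped)            # (K, d) candidate interest capsules
        logits = logits + mapped @ caps.T      # agreement update
    return caps

caps = b2i_dynamic_routing(np.random.randn(20, 32))  # 20 behaviors -> 3 interest capsules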

 

Through the multi-interest extractor layer, multiple interest capsules are built from the user behavior embeddings. During training, a label-aware attention layer lets the labeled (target) item choose which interest capsules to use. In particular, for each target item we compute the similarity between each interest capsule and the target item's embedding, and weight the interest capsules into the user representation vector for that target item; that is, the weight of an interest capsule is determined by its compatibility with the target. This is almost the same as the Attention in DIN, but the meanings of key and value differ: the target item is the query, and the interest capsules are both the keys and the values.

 

Per the MIND paper 【2】, the user vector for target item $i$ is

$$\vec{v}_u = \mathrm{Attention}\left(\vec{e}_i, \mathrm{V}_u, \mathrm{V}_u\right) = \mathrm{V}_u\, \mathrm{softmax}\!\left(\mathrm{pow}\left(\mathrm{V}_u^{\mathsf T}\vec{e}_i,\ p\right)\right)$$

where pow denotes element-wise exponentiation and $p$ is a tunable parameter that adjusts the attention distribution. As $p$ approaches 0, every interest capsule receives the same attention. When $p$ is greater than 1, larger dot products gain ever more weight as $p$ grows. In the limiting case, as $p$ approaches infinity, the mechanism becomes a kind of hard attention: it attends only to the largest value and ignores the rest.
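A minimal numpy sketch of this label-aware attention (p = 2 is an arbitrary choice; with a non-integer p the dot products would first need to be kept non-negative):

import numpy as np

def label_aware_attention(caps, target, p=2.0):
    # caps: (K, d) interest capsules; target: (d,) target item embedding
    logits = np.power(caps @ target, p)   # pow(V_u^T e_i, p), element-wise
    w = np.exp(logits - logits.max())
    w = w / w.sum()                       # softmax over the K capsules
    return w @ caps                       # (d,) user vector for this target item

v_u = label_aware_attention(np.random.randn(3, 32), np.random.randn(32))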

 

2.1 Computing the user embedding

MIND's core task is to learn a mapping from the user behavior sequence to a multi-interest vector representation. The user representation is defined as

$$\mathrm{V}_u = f_{\mathrm{user}}(\mathcal{I}_u) = \left(\vec{v}_u^{\,1}, \ldots, \vec{v}_u^{\,K}\right) \in \mathbb{R}^{d \times K}$$

the multi-interest vector representation of user $u$, where $d$ is the embedding dimension (this network sets $d = 32$, the same as our DSSM) and $K$ is the number of interest vectors. If $K = 1$, this degenerates to the single Embedding representation of other models such as the YouTube DNN.

 

2.2 Computing the item embedding

The embedding function of the target item is

$$\vec{e}_i = f_{\mathrm{item}}(\mathcal{F}_i)$$

where $f_{\mathrm{item}}$ denotes an Embedding & Pooling layer.

 

2.3 Loss function
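Following the MIND paper 【2】: with $\vec{v}_u$ the user vector produced by label-aware attention for target item $i$, the interaction probability and the training objective are

$$\Pr(i \mid u) = \frac{\exp\left(\vec{v}_u^{\mathsf T}\vec{e}_i\right)}{\sum_{j \in \mathcal{I}} \exp\left(\vec{v}_u^{\mathsf T}\vec{e}_j\right)}, \qquad L = \sum_{(u,\, i) \in \mathcal{D}} \log \Pr(i \mid u)$$

Summing over the whole item set $\mathcal{I}$ is too expensive, so training uses sampled softmax, which is where the negative sampling below comes in.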

 

2.4 Negative sampling

 

Negative sampling uniformly samples a number of cspu_ids as negative samples; the cspu_ids here were already filtered during sample construction, so all of them carry click behavior. A training sketch follows.
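A minimal TensorFlow 1.x sketch of this setup (all names, sizes, and the placeholder-based feed are illustrative assumptions, not our production code):

import tensorflow as tf

NUM_ITEMS, DIM, NUM_NEG = 200000, 32, 20

item_emb = tf.get_variable("item_emb", [NUM_ITEMS, DIM])  # also the softmax weights
item_bias = tf.zeros([NUM_ITEMS])
user_vec = tf.placeholder(tf.float32, [None, DIM])        # v_u from label-aware attention
labels = tf.placeholder(tf.int64, [None, 1])              # hash_id of the clicked cspu_id

loss = tf.reduce_mean(tf.nn.sampled_softmax_loss(
    weights=item_emb, biases=item_bias, labels=labels,
    inputs=user_vec, num_sampled=NUM_NEG, num_classes=NUM_ITEMS))
# sampled_softmax_loss defaults to a log-uniform sampler; for uniform negatives,
# pass sampled_values from tf.random.uniform_candidate_sampler.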

 

3. Implementation and deployment of multi-interest recall

 

The implementation of multi-interest recall specifically involves odps, the C++ engine, dpp, and the neuron estimation service.

 

3.1 odps part

odps mainly covers training-sample construction, model training, and offline model monitoring. The samples come in two versions: the first uses the algorithm side's user behavior table; the second addresses the online/offline inconsistency caused by delayed reporting in the first, so it instead dumps the click, favorite, and share sequences directly. Note that only the cspu_id is used as a feature here; the item's raw information is not used yet (Taobao, for example, uses brand_id and category_id), and that part will be added in later iterations.

 

Samples are built with a windowed approach over the past three days of data, then filtered against the pushable pool. Besides, because the cspu_id embeddings are looked up via embedding_lookup, a separate cspu_id-to-hash_id mapping table is built alongside the samples (a sketch follows); model training then proceeds.
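A minimal sketch of that cspu_id-to-hash_id mapping (a plain dense re-indexing; the data is illustrative):

def build_hash_map(cspu_ids):
    # map each distinct cspu_id to a dense hash_id usable by embedding_lookup
    return {cspu: idx for idx, cspu in enumerate(sorted(set(cspu_ids)))}

hash_map = build_hash_map(["c3", "c1", "c2", "c1"])   # persisted as an odps table
seq_hash_ids = [hash_map[c] for c in ["c1", "c9", "c3"] if c in hash_map]
print(seq_hash_ids)  # [0, 2] -- unseen cspu_ids are dropped from the sequence

Online, the C++ engine serves this same table, so the engine and the model resolve cspu_ids to identical embedding rows.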

3.2 C++ Engine part

The C++ engine stores the cspu_id-to-hash_id mapping table, and the vector cluster stores the item embedding table. Online, we fetch the user's clklist profile, use the cspu_ids saved in the clklist to query the engine's mapping table upfront for the hash_ids, and build the neuron features from them. All that is needed here is the hash_id sequence and its length; once passed to neuron, the user embedding can be estimated.

 

After obtaining the user embedding, dpp issues three concurrent requests to faiss to fetch the recall results for each interest, then concatenates and deduplicates them before passing them to the downstream layers.
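A sketch of that concurrent per-interest lookup and deduplication (ThreadPoolExecutor stands in for dpp's concurrency; the faiss index is the one built offline):

from concurrent.futures import ThreadPoolExecutor

def search_one_interest(index, vec, topn=100):
    # one faiss lookup for a single interest vector; vec has shape (1, d)
    _, ids = index.search(vec, topn)
    return ids[0].tolist()

def multi_interest_recall(index, user_vecs, topn=100):
    # user_vecs: (K, d). Fire K lookups concurrently, then dedup preserving order.
    with ThreadPoolExecutor(max_workers=len(user_vecs)) as pool:
        results = list(pool.map(
            lambda v: search_one_interest(index, v.reshape(1, -1), topn), user_vecs))
    seen, merged = set(), []
    for ids in results:
        for i in ids:
            if i not in seen:
                seen.add(i)
                merged.append(i)
    return merged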

 

3.2.1 A few pitfalls

  • The generation time of the embeddings is uncertain and there is no synchronization mechanism, so inconsistency between the neuron model and the embeddings in faiss can degrade the effect.
  • The embedding-building tasks execute serially on one cluster, so if there are many u2i recalls, tasks can get stuck waiting.

3.3 The neuron estimation part

The neuron estimation part requires building a service. Concretely, you write a neuron_script used when dpp calls neuron: following this script, neuron performs the feature processing, builds the model feed, pulls the model up through a tf-serving service, and finally returns the output user vector to dpp.

 

4. Stability of the multi-interest recall model

 

An online model has to be more than just effective; its operation must also be stable. So the online model needs corresponding offline monitoring with a blocking mechanism, plus online monitoring of the model service. The offline monitoring and blocking mechanism works during the model's day-level offline training: if the offline metrics (auc, pcoc, loss) are not as expected, the update is blocked in time and an alarm is raised. This acts as a pre-block, preventing this recall channel's online effect from being hurt. Online monitoring of the model service covers the recall vacancy rate, recall qps, the recall counts through the recall funnel, and the channel's pvctr, uvctr, and other metrics, with period-over-period comparison configured. This acts as post-monitoring: large fluctuations trigger an alarm so we can quickly roll back and handle the issue.

 

Model monitoring is implemented with odps: the training process dumps the loss into a table, and rules are then applied over it for long-term and short-term monitoring.

 

4.1 Monitoring offline model metrics

Long-term monitoring takes the past 30 days' 90% quantile and 1.1 times the maximum. If the loss falls below the 90% quantile, the model may be somewhat overfitted and the interests not divergent enough, so it is blocked; if it exceeds 1.1 times the maximum, the model may not be fully trained, the recall results may be too divergent, and there will be some bad cases.

 

Short-term monitoring mainly computes the mean and standard deviation over the past 14 days, then requires the loss to lie within (mean - 3*std, mean + 3*std); if not, the update is blocked. A Python rendering of both rules follows.
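The two rules in Python form (thresholds exactly as described above; the loss histories come from the day-level monitoring table):

import numpy as np

def should_block(cur_loss, loss_30d, loss_14d):
    # long-term rule: past 30 days' 90% quantile and 1.1x the maximum
    if cur_loss < np.quantile(loss_30d, 0.9):  # possibly overfit, interests too narrow
        return True
    if cur_loss > 1.1 * np.max(loss_30d):      # possibly under-trained, recall too divergent
        return True
    # short-term rule: mean +/- 3 standard deviations over the past 14 days
    mu, sigma = np.mean(loss_14d), np.std(loss_14d)
    return not (mu - 3 * sigma < cur_loss < mu + 3 * sigma)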

 

The monitoring SQL:

SELECT CUR_LOSS, MIN_LOSS, MAX_LOSS,
       IF(CUR_LOSS < MIN_LOSS OR CUR_LOSS > MAX_LOSS, 1, 0)
FROM (
    (
        SELECT loss AS CUR_LOSS, 1 AS rn1
        FROM deal_pai_model_mind_recall_check
        WHERE ds = (SELECT regexp_replace(substr(date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP()), 1), 1, 10), '-', ''))
    ) a
    LEFT JOIN (
        SELECT (avg(loss) - 2 * STDDEV(loss)) AS MIN_LOSS, 1 AS rn2
        FROM deal_pai_model_mind_recall_check
        WHERE ds < (SELECT regexp_replace(substr(date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP()), 1), 1, 10), '-', ''))
          AND ds > (SELECT regexp_replace(substr(date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP()), 14), 1, 10), '-', ''))
    ) b ON a.rn1 = b.rn2
    LEFT JOIN (
        SELECT (avg(loss) + 3 * STDDEV(loss)) AS MAX_LOSS, 1 AS rn3
        FROM deal_pai_model_mind_recall_check
        WHERE ds < (SELECT regexp_replace(substr(date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP()), 1), 1, 10), '-', ''))
          AND ds > (SELECT regexp_replace(substr(date_sub(FROM_UNIXTIME(UNIX_TIMESTAMP()), 14), 1, 10), '-', ''))
    ) c ON a.rn1 = c.rn3
);

 

4.3 Model blocking mechanism

Fine-tuned model updates are deployed through an odps shell script, while neuron periodically executes its deployment script on the jump host and checks whether a model file has been generated under the path. So two paths can be set up here: a fixed production path, and a model-file path for neuron. If the model is blocked, it is simply not pushed to neuron's model-file path (a sketch follows).
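A minimal sketch of the gated two-path push (the paths and the blocking flag are illustrative assumptions):

import os
import shutil

PROD_DIR = "/models/mind/prod"      # fixed production path: every day's model lands here
NEURON_DIR = "/models/mind/neuron"  # neuron's deployment script only ever loads from here

def push_if_healthy(model_file, blocked):
    # always archive to the production path
    shutil.copy(model_file, os.path.join(PROD_DIR, os.path.basename(model_file)))
    if blocked:
        return False                # not pushed: neuron keeps serving yesterday's model
    shutil.copy(model_file, os.path.join(NEURON_DIR, os.path.basename(model_file)))
    return True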

 

This uses the DataWorks data-quality feature: when a rule is violated, the flow is blocked. The blocking flow is configured as in the official documentation 【1】.

 

In the normal case the flow proceeds as usual; on an exception the downstream node is blocked, so the online model is not updated and model quality is protected.

 

5. Summary of the multi-interest recall model & future optimizations

 

5.1 Summary

The core of MIND multi-interest recall is taking the capsule network, originally applied to images, and applying it to recommendation, then using a dynamic routing algorithm to capture a user's multiple interests. The multiple interests recall different products to cover what the user cares about, and online inference keeps extracting the user's latest behavior, capturing the user's next interest points more accurately; this is why the online-inference version performs much better than the offline one.

5.2 Future optimization points

  • Add the items' raw information to the clicked/shared/favorited cspu_id behavior sequences, which helps cold start and gives the item embedding more product information to work with;
  • Engineering-side optimization: move the concurrent index lookups into the C++ engine, to allow more recalled interests;
  • Negative-sampling optimization: a DSSM-like negative-sample construction, instead of uniformly sampling from all clicked samples.

References

【1】 Official configuration documentation: https://help.aliyun.com/document_detail/73829.html?spm=a2c4g.11186623.6.1184.3d4b583az2xhTe

【2】 MIND: Multi-Interest Network with Dynamic Routing for Recommendation at Tmall. https://arxiv.org/pdf/1904.08030.pdf

【3】 CIKM 2019 | MIND: a multi-interest network model for the recall stage

