New feature | OpenSearch launches customized word weight models
2022-07-26 12:39:00 【Alibaba cloud big data AI technology】
Contents
Pain point 1: Strong industry specificity
Pain point 2: Building your own model is difficult, costly, and slow
OpenSearch lightweight customized word weight solution
E-commerce scenario comparison
Content scenario comparison
Business pain points
During the query analysis stage of search, the word weight model estimates how important each word in the query text is and quantifies that importance as a weight. Words with low weight may be left out of search recall. Without word weights, a query that contains unimportant words is still matched against everything the user typed, which can leave too few hits. Word weighting therefore effectively improves search recall and is an indispensable part of query analysis.
Pain point 1: Strong industry specificity
Customers' search applications usually belong to a specific industry. A general-purpose word weight model lacks industry knowledge, so applying it in such an industry produces more bad cases, which hurt recall and ranking.
An industry model alleviates this adaptation problem. However, if the customer's data falls into a narrow vertical category within the industry, or if no matching industry model exists, the word weights have to be customized on the customer's own data to achieve the best results.
Pain point 2: Building your own model is difficult, costly, and slow
Building a word weight model in-house mainly involves data labeling, model training, and deployment and operations, each with its own difficulty:

Difficulty 1: Labeling word weights demands strong domain knowledge, because judging how important each word is to a search engine is hard. The data set also needs to reach at least 10,000 examples, which can take months.
Difficulty 2: Model training has a high barrier to entry. It needs professional algorithm engineers for tuning, and both model quality and iteration speed depend heavily on how much those engineers can invest and how skilled they are.
Difficulty 3: Model deployment and operations are complex. They require engineering, algorithm, and operations teams to work together, and launching a deep model also involves many performance and efficiency optimizations.
OpenSearch lightweight customized word weight solution
Solution overview
Before text recall, OpenSearch analyzes the query semantics of the keywords the user enters. Business scenarios are diverse, and each industry and business has its own particularities, so only a word weight model tailored to the individual application can deliver the best search results.
Unlike word segmentation, word weight affects both recall and relevance ranking. In the recall stage, the word weight model assigns each query word a weight. In the first query, low-weight words are not required to match, although they still take part in the calculation. If the first query returns zero or few results, a second query drops the match requirement for both low-weight and medium-weight words, which broadens recall. In the relevance ranking stage, the weight of each query word feeds into the relevance features: among matching documents, those that hit high-weight words score higher and therefore rank higher. A customized word weight model is trained on the customer's own business data and greatly improves search results.
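The two-pass recall and weight-aware ranking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not OpenSearch's implementation: the numeric levels `LOW`/`MEDIUM`/`HIGH`, the `Doc` class, the `min_hits` threshold, and the scoring rule (sum of the weights of matched words) are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical numeric levels; the article only names low / medium / high.
LOW, MEDIUM, HIGH = 1, 2, 3


@dataclass
class Doc:
    doc_id: str
    terms: set  # words contained in the document


def recall(docs, weights, optional_levels):
    """Return documents containing every query word whose level is not optional."""
    required = {t for t, level in weights.items() if level not in optional_levels}
    return [d for d in docs if required <= d.terms]


def two_pass_recall(docs, weights, min_hits=1):
    """First pass: only low-weight words are optional.
    Second pass (too few hits): medium-weight words become optional as well."""
    hits = recall(docs, weights, {LOW})
    if len(hits) >= min_hits:
        return hits
    return recall(docs, weights, {LOW, MEDIUM})


def score(doc, weights):
    """Assumed relevance score: sum of the weights of the query words the doc hits.
    Low-weight words are optional for recall but still contribute here."""
    return sum(level for t, level in weights.items() if t in doc.terms)


def search(docs, weights, min_hits=1):
    hits = two_pass_recall(docs, weights, min_hits)
    return sorted(hits, key=lambda d: score(d, weights), reverse=True)


# Example query with illustrative weights.
weights = {"bear": MEDIUM, "digital display": LOW, "hand warmer": HIGH}
docs = [
    Doc("d1", {"bear", "hand warmer"}),
    Doc("d2", {"hand warmer", "digital display"}),
    Doc("d3", {"bear", "mug"}),
]

print([d.doc_id for d in search(docs, weights)])              # first pass is enough
print([d.doc_id for d in search(docs, weights, min_hits=2)])  # falls back to second pass
```

With `min_hits=1` the first pass already finds a document that hits both the medium-weight and the high-weight word. Raising `min_hits` to 2 triggers the second pass, where only the high-weight word must match, and the document that also hits the medium-weight word still ranks first.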
OpenSearch provides a rich set of domain-specific word weight models. Starting from the analyzer for their industry, users can obtain a customized word weight model through simple configuration and training. After training, users can review the model's effect in the console, including the difference rate and a comparison of typical word weight cases. Once the effect meets expectations, the customized word weight model can be used in OpenSearch, and its word weights can still be adjusted manually.
The whole customization process requires no additional data integration: word weight model training automatically extracts the existing data for adaptation.

Target customers
- Customers for whom search is a key scenario of the core business and who have high requirements for search results
- Customers in specialized industries, vertical categories, or niche businesses with many domain-specific terms
- Customers with limited search staff and relatively few algorithm engineers
How to use
- Import business data
- Create a word weight model: select the training fields and create the model
- After model training completes, reference the word weight model in query analysis
For more instructions, see: Recall custom word weights - Intelligent Open Search (OpenSearch) - Alibaba Cloud
Effect comparison
E-commerce scenario comparison
Original query | General e-commerce word weights | Customized word weights |
Winter four-piece gift set | winter: medium; four-piece set: high; gift: high | winter: medium; four-piece set: high; gift: low |
Bear digital-display hand warmer | bear: low; digital display: medium; hand warmer: high | bear: medium; digital display: low; hand warmer: high |
Defibrillator compatibility package | defibrillator: high; compatible: medium; package: low | defibrillator: medium; compatible: low; package: high |
22 year wall calendar | 22: low; year: low; wall calendar: high | 22: medium; year: low; wall calendar: high |
Content scenario comparison
Original query | General word weights | Customized word weights |
Potential function | potential: low; function: high | potential: high; function: low |
get post difference | get: medium; post: high; difference: low | get: high; post: high; difference: low |
KTV song-ordering system | ktv: medium; song ordering: high; system: medium | ktv: high; song ordering: medium; system: low |
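To make the e-commerce comparison easier to inspect, the short Python sketch below lists the terms whose weight level differs between the general and the customized model. The levels are transcribed from the e-commerce table, with English glosses as term names. The helper names and the "share of changed terms" figure are hypothetical illustrations; they are not necessarily how the console computes its reported metrics.

```python
# Weight levels transcribed from the e-commerce comparison table
# (term names are English glosses of the original query words).
GENERAL = {
    "winter four-piece gift set": {"winter": "medium", "four-piece set": "high", "gift": "high"},
    "bear digital-display hand warmer": {"bear": "low", "digital display": "medium", "hand warmer": "high"},
    "defibrillator compatibility package": {"defibrillator": "high", "compatible": "medium", "package": "low"},
    "22 year wall calendar": {"22": "low", "year": "low", "wall calendar": "high"},
}
CUSTOM = {
    "winter four-piece gift set": {"winter": "medium", "four-piece set": "high", "gift": "low"},
    "bear digital-display hand warmer": {"bear": "medium", "digital display": "low", "hand warmer": "high"},
    "defibrillator compatibility package": {"defibrillator": "medium", "compatible": "low", "package": "high"},
    "22 year wall calendar": {"22": "medium", "year": "low", "wall calendar": "high"},
}


def changed_terms(general, custom):
    """Return (query, term, general_level, custom_level) for every term whose level differs."""
    changes = []
    for query, levels in general.items():
        for term, old in levels.items():
            new = custom[query][term]
            if new != old:
                changes.append((query, term, old, new))
    return changes


def share_of_changed_terms(general, custom):
    """Hypothetical summary figure: fraction of terms whose level changed."""
    total = sum(len(levels) for levels in general.values())
    return len(changed_terms(general, custom)) / total


for query, term, old, new in changed_terms(GENERAL, CUSTOM):
    print(f"{query!r}: {term} {old} -> {new}")
print(f"{share_of_changed_terms(GENERAL, CUSTOM):.0%} of terms changed level")
```

A listing like this shows at a glance where customization shifts emphasis, for example away from the generic word "gift" in the first query.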
Summary:
- If your business already uses, or is preparing to use, an industry edition of OpenSearch, you can train a customized word weight model on top of the industry model.
- If OpenSearch does not yet offer an industry edition close to your business, train the customized word weight model on top of the general edition model. In this case, provide as much data as possible and keep its distribution as comprehensive and balanced as possible, which helps improve the model's effect.
- OpenSearch currently supports custom analyzers and custom word weight models. More customized recall models will follow, so stay tuned.