Daily Challenges of Search Engines, Part 4: External Heterogeneous Resources - Zhihu
2020-11-08 07:14:00 【I don't know.】
Foreword:
A search engine is an extremely complex piece of systems engineering. There are no miracles in search; quality comes from polishing one detail at a time. This series records everyday problems and, like glimpsing a leopard through a tube, uses small details to show the appeal of search engines.
Main text:
The silo effect in the mobile ecosystem is becoming more and more pronounced, yet the silos are still related to one another. A general-purpose search engine cannot cover every resource and every ecosystem on its own, so it inevitably has to bring in external resources.
By comparison, JD.com (Jingdong), Ctrip, Meituan, and similar services also handle a large volume of searches every day. Unlike general search, however, they search content produced by their own ecosystem, or structured content, so they do not have to bear this kind of "pain" the way a general search engine does.
External resources are mainly introduced and retrieved by having the provider expose interfaces and cards that deliver the service. In other cases the result jumps to the provider's app, which delivers the service there.
(This is why every major company is now building its own content ecosystem: the data comes in a standard format and is also easy to manage. Examples include Toutiao, "There was no.", Penguin, and even Zhihu columns.)
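As a rough illustration of the interface-and-card approach described above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Card` fields, the `Provider` signature, the `fetch_card` helper, and the 200 ms default timeout are hypothetical and do not come from any real provider's API. It only shows one plausible way to call an external interface under a time budget and fall back to showing no card when the provider is slow or returns incomplete data.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Card:
    """Hypothetical display unit for one external result."""
    title: str
    url: str
    source: str


# Hypothetical provider signature: takes a query string, returns a raw dict.
Provider = Callable[[str], dict]

# Shared worker pool so a slow provider does not block the caller.
_pool = ThreadPoolExecutor(max_workers=4)


def fetch_card(provider: Provider, query: str, source: str,
               timeout_s: float = 0.2) -> Optional[Card]:
    """Call an external interface; return None on timeout, error, or bad data."""
    future = _pool.submit(provider, query)
    try:
        raw = future.result(timeout=timeout_s)
    except Exception:
        # Covers both a timeout and an error raised by the provider:
        # the results page is rendered without this card.
        return None
    if not isinstance(raw, dict):
        return None
    title, url = raw.get("title"), raw.get("url")
    if not title or not url:
        return None  # incomplete response: do not render a broken card
    return Card(title=title, url=url, source=source)
```

The point of the timeout is that the search page's latency should not depend on the slowest external provider; whether dropping the card is the right fallback is a product decision this sketch does not settle.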
But when these resources have to be integrated into the search engine's blended results page, a lot of questions come up:
1 How the external party provides the data: by building a database on the search side, or by API requests? How large is the database? How much traffic will be diverted to the provider, and can it withstand that load? Each approach has its own advantages and disadvantages, so think it through first.
2 How to build the database? Build it together with the engine's own main database, or as a separate one? Both approaches have their own advantages and disadvantages.
3 How to align the fields used for database building, recall, and ranking? How to handle missing fields?
4 How to fuse the results on the ranking side, along with ecosystem considerations.
5 Scalability: how to make the work on standards, ingestion, ranking, and other layers as reusable as possible, and manage it in as unified a way as possible.
6 When resources are introduced through an API, content understanding is almost impossible to do.
7 Audit and operational control. If there is no way to audit, the content is not under control, and sensitive or vulgar content can have a big impact. With the ingestion approach this is manageable; with the API approach it is a problem.
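The question of field alignment and missing fields (item 3) can be made concrete with a small Python sketch. The schema, the provider names, and the field mappings below are all hypothetical, chosen only for illustration; the article does not specify any of them. The sketch assumes one unified record type, a per-provider mapping from unified field names to provider field names, and a simple policy: reject a record that lacks a required field, and give optional fields a conservative default.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class UnifiedDoc:
    """Hypothetical unified schema shared by ingestion, recall, and ranking."""
    doc_id: str
    title: str
    body: str
    source: str
    publish_ts: Optional[int] = None  # a missing timestamp stays None
    quality: float = 0.0              # conservative default if the provider sends none


# Hypothetical per-provider mapping: unified field name -> provider field name.
FIELD_MAPS = {
    "provider_a": {"doc_id": "id", "title": "name", "body": "content",
                   "publish_ts": "created_at", "quality": "score"},
    "provider_b": {"doc_id": "item_id", "title": "title", "body": "desc"},
}

REQUIRED = ("doc_id", "title")


def normalize(source: str, raw: Mapping[str, Any]) -> Optional[UnifiedDoc]:
    """Map one provider record to the unified schema.

    Returns None when a required field is missing, so the record is
    skipped at ingestion time instead of polluting recall and ranking.
    """
    fields = {}
    for unified, external in FIELD_MAPS[source].items():
        value = raw.get(external)
        if value is not None and value != "":
            fields[unified] = value
    if any(name not in fields for name in REQUIRED):
        return None
    fields["doc_id"] = str(fields["doc_id"])  # providers may send numeric ids
    fields.setdefault("body", "")             # optional text field defaults to empty
    return UnifiedDoc(source=source, **fields)
```

A step like this only exists on the ingestion path, which is one reason the ingestion approach is easier to audit and control than the API approach: every record passes through a point where it can be checked before it is served.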
Copyright notice
This article was written by [I don't know.]. Please include a link to the original when reposting. Thank you.