
Daily challenges of search engines (4): External heterogeneous resources - Zhihu

2020-11-08 07:14:00 I don't know.

Preface:

A search engine is an extremely complex piece of systems engineering. Search engines do not work miracles; they have to be polished bit by bit. This series records day-to-day problems and, by offering a small glimpse of the whole, tries to show some of the charm of search engines.


Main text:

The island (silo) effect in the mobile ecosystem is becoming more and more pronounced, yet the islands still have some connection with one another. A general search engine cannot satisfy every resource and every ecosystem on its own, so external resources will certainly have to be introduced.

By comparison, Jingdong, Ctrip, Meituan and others also handle a large number of searches every day, but unlike general search, they search the output of their own ecosystems, or structured content. On this point they do not have to bear the same "pain" as a general search engine.

The main way to introduce and retrieve external resources is for the provider to expose interfaces and serve results as cards. Another way is to jump to the provider's app, which then delivers the service.

(This is why every major tech company is now building its own content ecosystem: data in a standard format is also easier to manage. Examples include the headline, There was no., Penguin, and even Zhihu columns.)

But when these resources need to be integrated into the search engine's combined results page, a lot of questions come up:

1 How the external party provides the resource: by building an index (database) from its data, or by requesting its API at query time? How large is the index? How much traffic will be diverted to the provider? Can it withstand the load? Each approach has its own advantages and disadvantages, so think this through first.

2 How should the index be built? Together with the search engine's own main index, or as a separate index? Both approaches have their own advantages and disadvantages.

3 How should the fields used for index building, recall, and ranking be aligned? How should missing fields be handled?

4 How to fuse results on the ranking side, along with ecosystem considerations.

5 Scalability: how to make the work at each level (standards, ingestion, ranking, and so on) as reusable as possible, and manage it in as unified a way as possible.

6 When resources are introduced through an API, understanding their content is almost impossible.

7 Audit and operational control. If there is no way to audit, the content is uncontrolled, and sensitive or vulgar content can have a big impact. The ingestion approach fares better here; the API approach is a problem.
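
To make questions 3 and 4 a little more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any real system: the unified schema `Doc`, the provider name `provider_a`, its field names, and the `max_external` cap are all hypothetical. It shows one possible policy for field alignment (drop a record when a required field is missing, fall back to a neutral default when an optional one is missing) and one simple form of ranking-side fusion (merge by normalized score while limiting how many external results can enter the page).

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


# Hypothetical unified schema; the field names are assumptions for illustration.
@dataclass
class Doc:
    doc_id: str
    title: str
    source: str
    score: float                      # relevance normalized to [0, 1]
    publish_ts: Optional[int] = None  # stays None when the provider omits it


# Hypothetical per-provider mapping: unified field -> provider field.
FIELD_MAP = {
    "provider_a": {"doc_id": "id", "title": "name", "score": "rel", "publish_ts": "time"},
}


def align(raw: Dict[str, Any], source: str, max_score: float) -> Optional[Doc]:
    """Map one provider record onto the unified schema.

    max_score is the provider's (assumed positive) maximum relevance value.
    """
    mapping = FIELD_MAP[source]
    doc_id = raw.get(mapping["doc_id"])
    title = raw.get(mapping["title"])
    if doc_id is None or not title:
        return None  # required field missing: drop the record rather than guess
    rel = raw.get(mapping["score"])
    # Optional score missing: use 0.0 so the record ranks last instead of failing.
    score = 0.0 if rel is None else min(max(rel / max_score, 0.0), 1.0)
    return Doc(str(doc_id), title, source, score, raw.get(mapping["publish_ts"]))


def fuse(native: List[Doc], external: List[Doc], max_external: int, k: int) -> List[Doc]:
    """Merge by score, admitting at most max_external external results."""
    best_external = sorted(external, key=lambda d: d.score, reverse=True)[:max_external]
    merged = sorted(native + best_external, key=lambda d: d.score, reverse=True)
    return merged[:k]


raw_records = [
    {"id": 1, "name": "Hotel A", "rel": 8.0, "time": 1604790000},
    {"id": 2, "name": "Hotel B"},   # score and time missing
    {"name": "record without id"},  # required id missing
]
aligned = [align(r, "provider_a", max_score=10.0) for r in raw_records]
external_docs = [d for d in aligned if d is not None]
native_docs = [Doc("n1", "Web page 1", "web", 0.9), Doc("n2", "Web page 2", "web", 0.5)]
top = fuse(native_docs, external_docs, max_external=1, k=3)
print([d.doc_id for d in top])
```

The cap on external results is only one crude way to express the "ecosystem considerations" in question 4; a real system would more likely learn the blend, and comparing scores across sources assumes they have been calibrated to the same scale.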
