How to identify fake crawlers?
2022-07-31 21:11:00 【oHuangBing】
When we examine website logs, we often encounter various crawlers. Some are legitimate, for example search engine crawlers (the Baidu, Google, and Bing search engine crawlers, YandexBot, etc.), and others serve various other functions, which can be viewed here: list crawlers.
However, not all crawlers on the Internet are beneficial. Some try to hide themselves by imitating the characteristics of real crawlers. These are fake crawlers: they impersonate search engine crawlers in order to crawl the data of your website. Although the User-agent looks the same as the search engine's, the IP address does not belong to the search engine. In this case, we need to accurately identify the IP addresses of these fake crawlers.
With the Crawler IP Query Tool, we can easily identify fake crawlers. For example:
34.68.229.128 Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
This is a simplified log record. The IP address comes first, followed by the User-agent of the visiting crawler. Judging by the User-agent, it appears to be a spider of the Google search engine.
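As a rough illustration of reading such a record, the sketch below splits a simplified log line into its IP address and User-agent, then reports which search engine the User-agent claims to be. The function names and the `CLAIMED_CRAWLERS` table are hypothetical and only cover the crawlers mentioned above. A match here says nothing about whether the claim is true.

```python
# Hypothetical table: User-agent token -> search engine the crawler claims to be.
CLAIMED_CRAWLERS = {
    "Googlebot": "Google",
    "bingbot": "Bing",
    "Baiduspider": "Baidu",
    "YandexBot": "Yandex",
}


def parse_log_line(line):
    """Split a simplified 'IP User-agent' record into (ip, user_agent)."""
    ip, _, user_agent = line.strip().partition(" ")
    return ip, user_agent


def claimed_crawler(user_agent):
    """Return the search engine the User-agent claims to be, or None."""
    lowered = user_agent.lower()
    for token, engine in CLAIMED_CRAWLERS.items():
        if token.lower() in lowered:
            return engine
    return None


record = (
    "34.68.229.128 Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Mobile "
    "Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)
ip, user_agent = parse_log_line(record)
print(ip, claimed_crawler(user_agent))
```

The User-agent is only a claim made by the client, so the IP address still has to be checked separately.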
Querying this IP address shows that it is a fake Google spider.
We only need to enter the IP address of the suspected crawler to see information about it. In this way, neither the real Li Kui nor the fake one (that is, neither real nor fake crawlers) can escape our eyes.
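For readers who want a scripted check in addition to a lookup tool, one commonly used alternative is forward-confirmed reverse DNS, which Google documents for verifying Googlebot: resolve the IP to a hostname, check the hostname's domain, then resolve the hostname back and confirm it returns the same IP. The sketch below is not how the query tool works internally. The function name is hypothetical, and the suffix list is an assumption that should be checked against each search engine's current documentation. The lookups are injectable so the logic can be exercised without network access.

```python
import socket

# Assumption: hostname suffixes that genuine Googlebot addresses resolve to.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_genuine_crawler(ip, suffixes=GOOGLE_SUFFIXES,
                       reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS check (IPv4 only in this sketch)."""
    # Default to real DNS lookups unless test doubles are supplied.
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    try:
        host = reverse_lookup(ip)  # step 1: IP -> hostname
    except OSError:
        return False  # no reverse record: treat as not verified

    # Step 2: the hostname must belong to one of the expected domains.
    if not host.lower().rstrip(".").endswith(suffixes):
        return False

    try:
        # Step 3: the hostname must resolve back to the same IP.
        return ip in forward_lookup(host)
    except OSError:
        return False
```

Calling `is_genuine_crawler("34.68.229.128")` performs live DNS queries, so the result depends on the network environment in which it runs.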
If we want to see more fake bots, we can go here: listcrawlers fake bot, which collects the common fake bots found on the Internet.
Summary
This article introduced what a fake crawler is and how to use the crawler IP query tool to accurately identify fake crawlers.