How to identify fake crawlers?
2022-07-31 21:11:00 【oHuangBing】
When we examine website logs, we often encounter various crawlers. Some are legitimate, for example search engine crawlers (the Baidu, Google, and Bing search engine crawlers, YandexBot, etc.), along with crawlers that serve various other purposes, which can be viewed here: list crawlers.
However, not all crawlers on the Internet are beneficial. Some try to hide themselves by imitating the characteristics of real crawlers. These are fake crawlers: they impersonate search engine crawlers in order to scrape your website's data. Although the User-agent looks the same as the search engine's, the IP address does not belong to the search engine. In that case, we need to accurately identify the IP addresses of these fake crawlers.
With the Crawler IP Query Tool, we can easily identify fake crawlers. For example:
34.68.229.128 Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
This is a simplified log record. The IP address comes first, followed by the User-agent of the visiting crawler. From the User-agent, we can see that it claims to be a spider of the Google search engine.
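As a rough illustration, a record in this simplified format can be split into its IP address and User-agent, and the User-agent checked for a crawler name. The sketch below assumes the two-field layout shown above (real access logs usually carry more fields); the `CLAIMED_BOTS` table and the `parse_line` helper are hypothetical names introduced here, not part of any tool mentioned in this article.

```python
import re

# Hypothetical lookup table for illustration: User-agent token -> search engine.
CLAIMED_BOTS = {
    "googlebot": "Google",
    "bingbot": "Bing",
    "baiduspider": "Baidu",
    "yandexbot": "Yandex",
}

# Assumed layout: the IP address, whitespace, then the User-agent up to end of line.
LOG_LINE = re.compile(r"^(?P<ip>\S+)\s+(?P<ua>.+)$")


def parse_line(line):
    """Return (ip, user_agent, claimed_engine), or None if the line does not match.

    claimed_engine is None when the User-agent names no crawler in CLAIMED_BOTS.
    """
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    user_agent = match.group("ua")
    lowered = user_agent.lower()
    claimed = next(
        (engine for token, engine in CLAIMED_BOTS.items() if token in lowered),
        None,
    )
    return match.group("ip"), user_agent, claimed
```

Applied to the record above, this yields the IP address 34.68.229.128 and the claimed engine Google. The claim alone proves nothing, since a User-agent string can be set to any value by the client.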
Querying this IP address shows that it is a fake Google spider.

We only need to enter the IP address of a suspected crawler to see information about it. This way, genuine or fake, no crawler can escape our eyes.
If we want to see more fake bots, we can go here: listcrawlers fake bot, which collects the common fake bots found on the Internet.
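For readers who want to run a similar check themselves, one common alternative to a query tool is a reverse DNS lookup followed by a forward confirmation. This is not necessarily how the tool above works. The sketch below assumes that genuine crawlers resolve to hostnames under their operator's domains; the suffixes in `TRUSTED_SUFFIXES` are assumptions to verify against each search engine's own documentation, and `is_genuine` is a hypothetical helper name.

```python
import socket

# Assumed hostname suffixes per search engine; check each operator's documentation.
TRUSTED_SUFFIXES = {
    "Google": (".googlebot.com", ".google.com"),
    "Bing": (".search.msn.com",),
}


def is_genuine(ip, claimed_engine,
               reverse=socket.gethostbyaddr,
               forward=socket.gethostbyname_ex):
    """Check a claimed crawler IP with reverse DNS plus forward confirmation.

    The reverse and forward parameters default to the standard library
    resolvers and can be replaced with stubs for offline testing.
    """
    suffixes = TRUSTED_SUFFIXES.get(claimed_engine)
    if not suffixes:
        return False  # no known domains for this engine, so nothing to verify
    try:
        # Step 1: the IP must resolve to a hostname under a trusted domain.
        hostname = reverse(ip)[0].lower().rstrip(".")
        if not hostname.endswith(suffixes):
            return False
        # Step 2: that hostname must resolve back to the same IP.
        return ip in forward(hostname)[2]
    except OSError:
        # Lookup failures (socket.herror, socket.gaierror) count as unverified.
        return False
```

The forward confirmation matters because whoever controls an IP range also controls its reverse DNS records and could point them at any hostname, whereas only the search engine can make its own hostnames resolve back to that IP.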
Summary
This article introduced what fake crawlers are and how to use the crawler IP query tool to identify them accurately.