How to identify fake crawlers?
2022-07-31 21:11:00 【oHuangBing】
When we examine website logs, we often encounter all kinds of crawlers. Some are legitimate, for example search engine crawlers (the Baidu, Google, and Bing search engine crawlers, YandexBot, etc.), along with crawlers that serve various other purposes, which can be viewed here: list crawlers.
However, not all crawlers on the Internet are beneficial. Some try to hide themselves by imitating the characteristics of real crawlers. These are fake crawlers: they pose as search engine crawlers and scrape the data on your website. Although the User-agent looks the same as the search engine's, the IP address does not belong to the search engine. In that case, we need to accurately identify the IP addresses of these fake crawlers.
With the Crawler IP Query Tool, we can easily identify fake crawlers. For example:
34.68.229.128 Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
This is a simplified log record. The IP address comes first, followed by the User-agent of the visiting crawler. Judging by the User-agent alone, it appears to be a spider of the Google search engine.
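The record layout described above can be split apart with a few lines of Python. This is only a minimal sketch: the function names and the token table are hypothetical, and it assumes the simplified "IP, space, User-agent" layout rather than a full web server log format.

```python
# Hypothetical table: User-agent token -> search engine the request claims to be.
CLAIMED_BOT_TOKENS = {
    "Googlebot": "Google",
    "bingbot": "Bing",
    "Baiduspider": "Baidu",
    "YandexBot": "Yandex",
}


def parse_simplified_record(line):
    """Split a simplified 'IP User-agent' record into (ip, user_agent)."""
    ip, _, user_agent = line.strip().partition(" ")
    return ip, user_agent


def claimed_engine(user_agent):
    """Return the engine the User-agent claims to be, or None if no token matches."""
    lowered = user_agent.lower()
    for token, engine in CLAIMED_BOT_TOKENS.items():
        if token.lower() in lowered:
            return engine
    return None


record = (
    "34.68.229.128 Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Mobile "
    "Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)
ip, user_agent = parse_simplified_record(record)
print(ip, claimed_engine(user_agent))
```

The result only tells us what the request claims to be. Whether the IP address really belongs to that search engine still has to be checked separately.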
Querying this IP address shows that it is a fake Google spider:
[Screenshot: query result for 34.68.229.128]
We only need to enter the crawler's IP address to see information about it. This way, neither the real Li Kui nor the fake one (that is, neither genuine nor fake crawlers) can escape our eyes.
To see more fake bots, we can go here: listcrawlers fake bot, which catalogs the common fake bots on the Internet.
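For readers who want to cross-check an IP address themselves, one common approach is a forward-confirmed reverse DNS lookup. This is a different technique from the query tool and not necessarily how the tool works. The sketch below is only an illustration: the function names are hypothetical, the list of accepted hostname suffixes is an assumption that should be checked against the search engine's own documentation, and it handles IPv4 only.

```python
import socket

# Assumption: hostname suffixes treated as genuine for Googlebot in this sketch.
GOOGLEBOT_SUFFIXES = (".googlebot.com", ".google.com")


def _reverse_lookup(ip):
    """Return the primary hostname from the PTR lookup for ip."""
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(hostname):
    """Return the list of IPv4 addresses that hostname resolves to."""
    return socket.gethostbyname_ex(hostname)[2]


def is_genuine_googlebot(ip, reverse_lookup=_reverse_lookup,
                         forward_lookup=_forward_lookup):
    """Check ip with a reverse lookup followed by a confirming forward lookup.

    The lookups are injectable so the logic can be exercised without network
    access; by default they use the system resolver.
    """
    try:
        hostname = reverse_lookup(ip)
    except OSError:
        # No PTR record or resolver failure: treat as not verified.
        return False
    if not hostname.lower().rstrip(".").endswith(GOOGLEBOT_SUFFIXES):
        return False
    try:
        # The hostname must resolve back to the same IP; otherwise the
        # PTR record could simply have been forged by the IP's owner.
        return ip in forward_lookup(hostname)
    except OSError:
        return False
```

Calling `is_genuine_googlebot("34.68.229.128")` with the default lookups performs real DNS queries, so the answer depends on the resolver at the time of the call. The forward lookup matters because whoever controls an IP address also controls its reverse DNS record and can make it say anything.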
Summary
This article introduced what fake crawlers are and how to use the Crawler IP Query Tool to accurately identify them.