How to identify fake crawlers?
2022-07-31 21:11:00 【oHuangBing】
When we examine website logs, we often encounter various crawlers. Some are legitimate, for example search engine crawlers (the Baidu, Google, and Bing search engine crawlers, YandexBot, etc.), and others serve various other purposes; these can be viewed here: list crawlers.
However, not all crawlers on the Internet are beneficial. Some try to hide themselves by imitating the characteristics of real crawlers. These are fake crawlers: they impersonate search engines in order to crawl the data of your website. Although the User-agent looks the same as the search engine's, the IP address does not belong to the search engine. In this case, we need to accurately identify the IP addresses of these fake crawlers.
With the Crawler IP Query Tool, we can easily identify fake crawlers, for example:
34.68.229.128 Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
This is a simplified log record. The IP address comes first, followed by the User-agent of the visiting crawler. Judging by the User-agent, it claims to be a spider of the Google search engine.
Querying the IP address, however, shows that this is a fake Google spider.
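As a rough illustration of reading such a record, the sketch below splits a simplified "IP, then User-agent" line and reports which search engine the User-agent claims to be. The function names and the token table are hypothetical, chosen for this example only, and real access logs usually carry more fields than this simplified form.

```python
# Hypothetical table: User-agent token -> search engine the crawler claims to be.
CLAIMED_CRAWLERS = {
    "Googlebot": "Google",
    "bingbot": "Bing",
    "Baiduspider": "Baidu",
    "YandexBot": "Yandex",
}


def parse_log_line(line):
    """Split a simplified 'IP User-agent' record into (ip, user_agent)."""
    ip, _, user_agent = line.strip().partition(" ")
    return ip, user_agent


def claimed_crawler(user_agent):
    """Return the search engine named in the User-agent, or None if none matches."""
    lowered = user_agent.lower()
    for token, engine in CLAIMED_CRAWLERS.items():
        if token.lower() in lowered:
            return engine
    return None
```

The result only tells us what the visitor claims to be; the User-agent is sent by the client and can be set to anything, which is why the IP address still has to be checked.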
We only need to enter the IP address of the suspected crawler to see information about it. In this way, neither the real Li Kui nor the fake one (that is, neither real nor fake crawlers) can escape our eyes.
If we want to see more fake bots, we can go here: listcrawlers fake bot, which collects the common fake bots on the Internet.
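The same idea can be sketched locally: compare the visitor's IP address with the address ranges the search engine is known to crawl from. This is not how the query tool itself works (the article does not describe its internals); it is a minimal sketch in which the function name is hypothetical and the single range listed is an illustrative assumption that should be replaced with the ranges the search engine actually publishes.

```python
import ipaddress

# Assumption for illustration only: one range attributed to Google's crawler.
# In practice, load the up-to-date ranges published by each search engine.
ASSUMED_CRAWLER_NETWORKS = {
    "Google": [ipaddress.ip_network("66.249.64.0/19")],
}


def is_fake_crawler(ip, claimed_engine):
    """Return True if the IP is outside every range listed for the claimed engine."""
    address = ipaddress.ip_address(ip)
    networks = ASSUMED_CRAWLER_NETWORKS.get(claimed_engine, [])
    return not any(address in network for network in networks)
```

With this assumed table, the address from the log record above, 34.68.229.128, falls outside the listed range, so a visitor using it while claiming to be Google would be flagged. An engine with no entry in the table is always flagged, so the table must be complete before the result is trusted.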
Summary
This article introduced what fake crawlers are and how to use the crawler IP query tool to accurately identify them.