How to identify fake crawlers?
2022-07-31 21:11:00 【oHuangBing】
When we examine website logs, we often encounter various crawlers. Some are legitimate, for example search engine crawlers (the Baidu, Google, and Bing search engine crawlers, YandexBot, etc.), along with crawlers that serve various other purposes, which can be viewed here: list crawlers.
However, not all crawlers on the Internet are beneficial. Some try to hide themselves by imitating the characteristics of real crawlers. These are fake crawlers: they pose as search engine crawlers and scrape the data on your website. Although the User-agent looks the same as the search engine's, the IP address does not belong to the search engine. In that case, we need to accurately identify the IP addresses of these fake crawlers.
With the Crawler IP Query Tool, we can easily identify fake crawlers. For example:
34.68.229.128 Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
This is a simplified log record. The IP address comes first, followed by the User-agent the crawler sent. Judging by the User-agent, it claims to be a spider of the Google search engine.
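As a rough sketch of how such a record could be split up in code, the snippet below assumes the simplified format shown above (the IP address, a single space, then the User-agent). The function names and the `CLAIMED_CRAWLERS` table are hypothetical and only cover a few crawlers for illustration.

```python
# Hypothetical table mapping a User-agent token to the search engine it claims to be.
# Only a few common tokens are listed; this is not an exhaustive list.
CLAIMED_CRAWLERS = {
    "googlebot": "Google",
    "bingbot": "Bing",
    "baiduspider": "Baidu",
    "yandexbot": "Yandex",
}


def parse_log_line(line):
    """Split a simplified log record into (ip, user_agent).

    Assumes the record is '<ip> <user-agent>', as in the example above.
    """
    ip, _, user_agent = line.strip().partition(" ")
    return ip, user_agent


def claimed_crawler(user_agent):
    """Return the search engine the User-agent claims to be, or None."""
    ua = user_agent.lower()
    for token, name in CLAIMED_CRAWLERS.items():
        if token in ua:
            return name
    return None


record = (
    "34.68.229.128 Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.121 Mobile "
    "Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)
ip, user_agent = parse_log_line(record)
print(ip, claimed_crawler(user_agent))
```

The User-agent only tells us what the client claims to be. Whether the claim is true still has to be decided from the IP address.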
Querying this IP address in the tool shows that it is a fake Google spider.

We only need to enter the IP address of the suspected crawler to see information about it. In this way, neither the real nor the fake Li Kui (that is, neither real nor fake crawlers) can escape our eyes.
If you want to see more fake bots, you can go here: listcrawlers fake bot, which catalogs the common fake bots on the Internet.
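The article relies on the query tool for this lookup. A different check that can be run locally, not the tool's own method, is a forward-confirmed reverse DNS lookup: resolve the IP address to a hostname, check that the hostname belongs to the search engine's domain, then resolve the hostname back and confirm it returns the same IP address. The sketch below is a minimal version for Googlebot. The function name and the suffix list are assumptions for illustration, and the lookup functions are passed in as parameters so the logic can be tried without network access.

```python
import socket

# Assumed hostname suffixes for genuine Googlebot hosts; check the search
# engine's own documentation for the authoritative list.
GOOGLEBOT_SUFFIXES = (".googlebot.com", ".google.com")


def is_genuine_crawler(ip, suffixes=GOOGLEBOT_SUFFIXES,
                       reverse_lookup=socket.gethostbyaddr,
                       forward_lookup=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check (hypothetical helper).

    Returns True only if the IP resolves to a hostname under one of the
    given suffixes AND that hostname resolves back to the same IP.
    """
    try:
        hostname = reverse_lookup(ip)[0]
    except OSError:  # no reverse DNS record, or the lookup failed
        return False
    if not hostname.lower().rstrip(".").endswith(suffixes):
        return False
    try:
        addresses = forward_lookup(hostname)[2]
    except OSError:  # hostname does not resolve
        return False
    return ip in addresses
```

With the default arguments, calling `is_genuine_crawler("34.68.229.128")` performs real DNS queries, so the result depends on the network at the time of the call. Note that `socket.gethostbyname_ex` only handles IPv4 addresses; an IPv6 variant would need `socket.getaddrinfo` instead.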
Summary
This article introduced what fake crawlers are and how to use the crawler IP query tool to accurately identify them.







