Web Crawlers: From Getting Started to Going to Jail (I) -- Concepts
2022-07-28 14:51:00 【Youth is short!】
What is a web crawler?
You have probably heard the term for a long time without knowing exactly what it means, so let's define it.
Web crawler technology:
A program simulates a browser accessing the internet, then automatically fetches data from the web according to preset rules.
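As a minimal sketch of this definition, the Python snippet below builds a request that carries browser-like headers and applies one simple "rule": collect the link targets on a page. The URL, the header value, and the names `build_request`, `LinkCollector`, and `extract_links` are illustrative assumptions, not part of the original article, and a fixed HTML string stands in for a downloaded page.

```python
from html.parser import HTMLParser
from urllib.request import Request

# Hypothetical values chosen for illustration only.
START_URL = "https://example.com/"
BROWSER_HEADERS = {"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) ExampleCrawler/0.1"}


def build_request(url):
    """Build a request that carries browser-like headers."""
    return Request(url, headers=BROWSER_HEADERS)


class LinkCollector(HTMLParser):
    """The 'rule' in this sketch: collect the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Apply the rule to a page and return the links it contains."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


# To fetch a real page, one would pass build_request(START_URL) to
# urllib.request.urlopen and decode the response body. Here a fixed
# string stands in for the downloaded page.
sample_html = '<html><body><a href="/a.html">A</a><a href="/b.html">B</a></body></html>'
print(extract_links(sample_html))
```

Keeping the request-building step separate from the extraction rule makes it easy to try the rule on saved pages before pointing the program at a live site.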
The value of crawlers:
You can fetch large amounts of useful data from the internet for your own use, then apply data analysis to turn it into commercial or product value.
Employment perspective:
With the arrival of the big data era, salaries are high and there is a large shortage of qualified people.
Are crawlers legal?
Crawler technology is a tool and is neutral in itself, so the law does not prohibit it.
What carries legal risk is using it for illegal behavior.
Crawlers can therefore be divided into friendly crawlers and malicious crawlers.
Malicious crawlers:
1. The crawler interferes with the normal operation of the website it visits
2. The crawler fetches data that is protected by law
Mess around and you can easily end up in "the slammer"!
Keep optimizing your code and logic so that, as the website is updated, your crawler does not start interfering with its operation.
Review the data you crawl promptly. If it involves privacy, commercially sensitive information, or anything else that must not be disclosed, be sure to delete it!
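One common way to stay on the friendly side is to honor a site's robots.txt file and to pause between requests. The article does not mention robots.txt, so treat the sketch below as an assumption about what "not interfering" can look like in practice. The robots.txt content, the crawler name, and the URLs are hypothetical; `RobotFileParser` comes from Python's standard library.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

USER_AGENT = "ExampleCrawler"  # hypothetical crawler name


def make_rules(robots_txt):
    """Parse robots.txt text into a rule object."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def polite_urls(parser, urls):
    """Keep only the URLs the site allows this crawler to fetch."""
    return [u for u in urls if parser.can_fetch(USER_AGENT, u)]


rules = make_rules(ROBOTS_TXT)
candidates = [
    "https://example.com/articles/1.html",
    "https://example.com/private/users.html",
]
allowed = polite_urls(rules, candidates)
delay = rules.crawl_delay(USER_AGENT) or 1  # fall back to 1 second if unspecified

for url in allowed:
    print("would fetch", url, "then wait", delay, "seconds")
    # When actually fetching, call time.sleep(delay) here between requests.
```

Following robots.txt does not by itself make a crawl lawful; it only covers the "do not disrupt the site" part, and the fetched data still needs the review described above.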
Classification of crawlers by usage scenario
- General-purpose crawler:
An important component of a crawling system. It fetches the data of an entire page.
- Focused crawler:
Built on top of a general-purpose crawler. It fetches only specific parts of a page's content.
- Incremental crawler:
Detects data updates on a website. It fetches only the most recently updated data.
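The three categories can be contrasted in a small Python sketch. It assumes, purely for illustration, that the "specific content" is the text of `<h2>` headings and that updates are detected by remembering a hash of each item already seen. The sample pages and all names (`general_crawl`, `focused_crawl`, `IncrementalStore`) are hypothetical.

```python
import hashlib
from html.parser import HTMLParser


class TitleExtractor(HTMLParser):
    """Focused crawling: keep only the text inside <h2> tags."""

    def __init__(self):
        super().__init__()
        self.in_h2 = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_h2 = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_h2 = False

    def handle_data(self, data):
        if self.in_h2 and data.strip():
            self.titles.append(data.strip())


def general_crawl(html):
    """General-purpose crawling: keep the whole page as-is."""
    return html


def focused_crawl(html):
    """Focused crawling: return only the <h2> texts of the page."""
    extractor = TitleExtractor()
    extractor.feed(html)
    return extractor.titles


class IncrementalStore:
    """Incremental crawling: remember fingerprints of items already seen."""

    def __init__(self):
        self.seen = set()

    def new_items(self, items):
        fresh = []
        for item in items:
            fingerprint = hashlib.sha256(item.encode("utf-8")).hexdigest()
            if fingerprint not in self.seen:
                self.seen.add(fingerprint)
                fresh.append(item)
        return fresh


# Two snapshots of the same hypothetical page, before and after an update.
page_v1 = "<h1>News</h1><h2>Post A</h2><p>...</p><h2>Post B</h2>"
page_v2 = "<h1>News</h1><h2>Post A</h2><h2>Post B</h2><h2>Post C</h2>"

store = IncrementalStore()
first_run = store.new_items(focused_crawl(page_v1))
second_run = store.new_items(focused_crawl(page_v2))
print(first_run)   # every heading is new on the first visit
print(second_run)  # only the heading added since the first visit
```

A real incremental crawler would persist the fingerprints (for example in a database) so that they survive between runs; the in-memory set here only keeps the sketch short.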
Anti-crawling mechanisms
A portal website can adopt corresponding strategies or technical means to prevent crawlers from fetching its data.
Of course, for every defense there is a countermeasure: anti-anti-crawling techniques, ha ha (the two sides are always competing).
Anti-anti-crawling strategy:
A crawler program can adopt its own strategies or technical means to break through a portal website's anti-crawling mechanisms and obtain the relevant data.
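To make the contest concrete, here is a toy sketch of what a site-side check might look like. It is not how any particular website works; the thresholds, the User-Agent markers, and the class name `ToyAntiCrawler` are assumptions made up for illustration. The check rejects clients whose User-Agent looks like a default script and clients that send too many requests within a time window.

```python
from collections import defaultdict, deque

# Hypothetical thresholds and markers, chosen only for illustration.
MAX_REQUESTS = 3
WINDOW_SECONDS = 10.0
SCRIPT_MARKERS = ("python-urllib", "python-requests")


class ToyAntiCrawler:
    """Site-side check: reject script-like clients and overly frequent requests."""

    def __init__(self):
        # client id -> timestamps of that client's recent accepted requests
        self.history = defaultdict(deque)

    def allow(self, client_id, user_agent, now):
        if any(marker in user_agent.lower() for marker in SCRIPT_MARKERS):
            return False
        recent = self.history[client_id]
        # Drop timestamps that have fallen out of the time window.
        while recent and now - recent[0] >= WINDOW_SECONDS:
            recent.popleft()
        if len(recent) >= MAX_REQUESTS:
            return False
        recent.append(now)
        return True


guard = ToyAntiCrawler()
print(guard.allow("client-1", "Python-urllib/3.11", now=0.0))
burst = [guard.allow("client-2", "Mozilla/5.0", now=t) for t in (0.0, 1.0, 2.0, 3.0)]
print(burst)
print(guard.allow("client-2", "Mozilla/5.0", now=12.0))
```

Seen from the crawler's side, the matching countermeasures are the ones a well-behaved crawler uses anyway: identify itself with a sensible User-Agent and keep its request rate low enough that the site is not disturbed.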