UA spoofing, and passing parameters with GET and POST in requests to obtain JSON content
2022-07-03 07:36:00 【start field】
First, let's look at one way around a common anti-crawling measure: UA spoofing.
UA stands for User-Agent, the header that identifies the client making the request.
Many websites run a User-Agent check as an anti-crawling mechanism. The server inspects the identity of the requesting client: if it looks like a browser, the request is treated as normal; if it does not, the request is assumed to come from a crawler and the server is likely to reject it.
UA spoofing: make the crawler identify itself as a browser.
How do you do that?
First open a browser, then right-click the page and choose Inspect, or press F12 (Fn+F12 on some laptops), to open the developer tools.
Then switch to the Network tab, select any request, scroll down through its request headers to User-Agent, copy the value, and put it in a dictionary.
headers = { "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36 Edg/97.0.1072.55" }
Pass the dictionary in when you make the request:
response = requests.get(url=url, headers=headers)
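To see what the headers argument changes, the sketch below prepares two requests without sending them and compares the User-Agent each one would carry. The URL is a placeholder chosen for illustration, not something from the original post.

```python
import requests

# Placeholder URL for illustration only; nothing is sent over the network.
URL = "https://example.com/"

headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36 Edg/97.0.1072.55"
}

session = requests.Session()

# Without custom headers, requests announces itself as "python-requests/<version>".
default_req = session.prepare_request(requests.Request("GET", URL))
default_ua = default_req.headers["User-Agent"]

# With the headers dictionary, the browser string replaces the default.
spoofed_req = session.prepare_request(requests.Request("GET", URL, headers=headers))
spoofed_ua = spoofed_req.headers["User-Agent"]

print("default:", default_ua)
print("spoofed:", spoofed_ua)
```

The default value is exactly what a User-Agent check looks for, which is why an unmodified crawler is easy to reject.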
Next, how to send parameters along with the URL.
The parameters are also put in a dictionary. For a GET request the dictionary is passed as params; for a POST request it is passed as data.
response = requests.get(url=url, params=param, headers=headers)
response = requests.post(url=url, data=data, headers=headers)
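The difference between params and data can be inspected without sending anything. This is a minimal sketch: the URL and the sample value are assumptions made for illustration.

```python
import requests

# Placeholder URL and sample parameters, for illustration only.
URL = "https://example.com/search"
query = {"kw": "dog"}

# prepare() builds the request exactly as it would be sent, without sending it.
get_req = requests.Request("GET", URL, params=query).prepare()
post_req = requests.Request("POST", URL, data=query).prepare()

# params ends up in the query string of the URL.
print(get_req.url)    # https://example.com/search?kw=dog
print(get_req.body)   # None

# data ends up form-encoded in the request body; the URL is unchanged.
print(post_req.url)   # https://example.com/search
print(post_req.body)  # kw=dog
print(post_req.headers["Content-Type"])  # application/x-www-form-urlencoded
```

Which of the two a site expects can be read from the same Network tab: GET parameters appear in the request URL, POST parameters in the request payload.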
Finally, how to get JSON-formatted data.
response.json() returns a Python object (obj) parsed from the response body. Call json() only after confirming that the response content really is JSON; otherwise parsing fails with an error.
How do you know whether the response content is JSON? Again, open the developer tools, switch to the Network tab, select the request, and look for Content-Type in the response headers. Its value tells you the type.
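The same check can be done in code before calling json(). The helper below is a hypothetical name introduced here, not part of requests, and it only inspects the header value; some servers label JSON bodies with other types, so treat it as a first filter rather than a guarantee.

```python
def looks_like_json(content_type: str) -> bool:
    """Return True if a Content-Type header value names a JSON media type."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "application/json" or media_type.endswith("+json")


print(looks_like_json("application/json; charset=utf-8"))  # True
print(looks_like_json("text/html; charset=utf-8"))         # False
```

With a real response this would be called as looks_like_json(response.headers.get("Content-Type", "")), and response.json() used only when it returns True.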
The following code crawls Baidu Translate:
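# -*- coding: utf-8 -*-
import json

import requests

if __name__ == "__main__":
    # Endpoint requested in this example
    url = 'https://fanyi.baidu.com/sug'
    word = input('enter a word:')
    # POST parameters travel in the request body, so they go in data
    data = {
        'kw': word
    }
    # UA spoofing: identify the crawler as a browser
    headers = {
        "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36 Edg/97.0.1072.55"
    }
    response = requests.post(url=url, data=data, headers=headers)
    # Parse the JSON response body into a Python object
    dic_obj = response.json()
    filename = word + '.json'
    with open(filename, 'w', encoding='utf-8') as fp:
        # ensure_ascii=False writes Chinese characters as-is instead of \uXXXX escapes
        json.dump(dic_obj, fp=fp, ensure_ascii=False)
    print('over')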
To use json.dump, import the json library first. dump() encodes a Python object as JSON and writes it to the given file object (json.dumps returns the JSON string instead).
ensure_ascii: this parameter takes a Boolean value. It defaults to True, in which case every non-ASCII character is written as a \uXXXX escape. Set ensure_ascii to False to output Chinese characters as they are.
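The effect of ensure_ascii is easiest to see with json.dumps, which takes the same parameter as json.dump but returns a string. The sample dictionary is made up for illustration.

```python
import json

# Sample data invented for this illustration.
data = {"kw": "dog", "v": "狗"}

# Default (ensure_ascii=True): non-ASCII characters become \uXXXX escapes.
escaped = json.dumps(data)

# ensure_ascii=False: non-ASCII characters are kept as-is.
readable = json.dumps(data, ensure_ascii=False)

print(escaped)
print(readable)

# Both strings decode to the same Python object; only the text form differs.
print(json.loads(escaped) == json.loads(readable))  # True
```

When writing with ensure_ascii=False, open the file with encoding='utf-8', as the crawler above does, so the characters are stored correctly.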