UA spoofing, passing parameters with GET and POST in requests, and retrieving JSON content
2022-07-03 07:36:00 【start field】
First, let's look at a common anti-crawling measure and at UA spoofing, the way around it.
UA stands for User-Agent, the header that identifies the client making the request.
Many websites check the User-Agent of incoming requests. If it identifies a browser, the request is treated as normal; if it does not, the request is assumed to come from a crawler, and the server will likely reject it.
UA spoofing: make the crawler identify itself as a browser.
How is it done?
First open the browser, then right-click and choose Inspect, or press F12 (Fn+F12 on some keyboards), to open the developer tools.

Then open the Network tab, select a request, scroll down through its request headers to User-Agent, copy the value, and wrap it in a dictionary.

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36 Edg/97.0.1072.55"
}
Pass it in when you make the request:
response = requests.get(url=url, headers=headers)
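To see what the disguise changes, here is a minimal sketch that builds a request without sending it and compares the User-Agent that requests uses by default with the browser string. The URL is a placeholder chosen for illustration, not one from this article.

```python
import requests

# Placeholder URL for illustration only; nothing is sent to it.
url = "https://example.com/"

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36 Edg/97.0.1072.55"
}

# What requests announces itself as when no User-Agent is supplied.
default_ua = requests.utils.default_user_agent()

# Build the request without sending it, so the headers can be inspected offline.
prepared = requests.Request("GET", url, headers=headers).prepare()

print("default:  ", default_ua)
print("disguised:", prepared.headers["User-Agent"])
```

The default value starts with python-requests/, which is exactly the kind of identity a UA check can spot.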
Next, how to send parameters with a request.
The parameters are also wrapped in a dictionary. For a GET request, pass the dictionary as params; for a POST request, pass it as data.
response = requests.get(url=url, params=param, headers=headers)
response = requests.post(url=url, data=data, headers=headers)
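The difference between params and data is easiest to see by preparing both requests without sending them. This is a sketch only: the endpoint and the parameter names q and page are made up for illustration.

```python
import requests

# Hypothetical endpoint and parameters, used only to show where the values end up.
url = "https://example.com/search"
param = {"q": "dog", "page": "1"}
data = {"kw": "dog"}

# Prepare the requests without sending them.
get_req = requests.Request("GET", url, params=param).prepare()
post_req = requests.Request("POST", url, data=data).prepare()

print(get_req.url)                       # params are appended to the URL as a query string
print(post_req.url)                      # the POST URL is left unchanged
print(post_req.body)                     # data is form-encoded into the request body
print(post_req.headers["Content-Type"])  # set automatically for form data
```

So params changes the URL, while data travels in the body of the request.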
Finally, how to get JSON-formatted data.
response.json() parses the response body and returns a Python object (obj). Use it only after confirming that the response content is JSON.
How do you know whether the response content is JSON? Open the developer tools again, go to the Network tab, select the request, and look for Content-Type in the response headers. It tells you the type.
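The same check can be done in code instead of in the developer tools. The sketch below uses only the standard library: is_json_content_type is a hypothetical helper, and the header value and body are invented samples. With requests, the header would come from response.headers["Content-Type"] and the parsing would be done by response.json().

```python
import json


def is_json_content_type(content_type: str) -> bool:
    """Return True if a Content-Type header value denotes JSON (hypothetical helper)."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "application/json" or media_type.endswith("+json")


# Invented sample values standing in for a real response.
content_type = "application/json; charset=utf-8"
body = '{"status": "ok", "items": ["dog", "doge"]}'

# Parse the body only when the header says it is JSON.
parsed = json.loads(body) if is_json_content_type(content_type) else None
print(parsed)
```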
The following code crawls Baidu Translate:
# -*- coding: UTF-8 -*-
import json
import requests

if __name__ == "__main__":
    url = 'https://fanyi.baidu.com/sug'
    word = input('enter a word:')
    # POST parameters go into a dictionary that is passed as data
    data = {
        'kw': word
    }
    # UA spoofing: identify the crawler as a browser
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36 Edg/97.0.1072.55"
    }
    response = requests.post(url=url, data=data, headers=headers)
    # json() parses the JSON response body into a Python object
    dic_obj = response.json()
    filename = word + '.json'
    with open(filename, 'w', encoding='utf-8') as fp:
        json.dump(dic_obj, fp=fp, ensure_ascii=False)
    print('over')
Using json.dump requires importing the json library. dump() serializes a Python object as JSON and writes it to the given file object.
ensure_ascii: this parameter takes a Boolean. It defaults to True, in which case every non-ASCII character is written as a \uXXXX escape. Set ensure_ascii to False to write Chinese characters as they are.
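A small sketch of the effect of ensure_ascii, using json.dumps (the string-returning counterpart of json.dump) so that the result can be printed directly. The sample dictionary is made up for illustration.

```python
import json

# Made-up sample containing a Chinese character.
record = {"kw": "狗"}

escaped = json.dumps(record)                       # default: ensure_ascii=True
readable = json.dumps(record, ensure_ascii=False)  # keep non-ASCII characters as-is

print(escaped)   # the Chinese character appears as a \uXXXX escape
print(readable)  # the Chinese character appears unchanged

# Both strings decode back to the same Python object.
assert json.loads(escaped) == json.loads(readable) == record
```

Both forms are valid JSON; ensure_ascii=False only makes the saved file readable for people.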