UA spoofing, and passing parameters with GET and POST in requests to get JSON content
2022-07-03 07:36:00 【start field】
First, let's look at a way around a common anti-scraping measure: UA spoofing.
UA stands for User-Agent, the header that identifies the client making the request.
Many websites check the User-Agent as an anti-scraping measure. If the header identifies the client as a browser, the request is treated as normal; if it does not, the request is assumed to come from a crawler and the server will likely reject it.
UA spoofing: make the crawler present itself as a browser.
How do you do it?
First open the browser, then right-click and choose Inspect, or press F12 (Fn+F12 on some keyboards), to open the developer tools.

Then open the Network tab, select a request, scroll down its request headers to find User-Agent, copy the value, and wrap it in a dictionary.

headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36 Edg/97.0.1072.55"
}
Pass it in when you make the request:
response = requests.get(url=url, headers=headers)
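To see which User-Agent would be sent without touching the network, one option is to build the request and inspect it before sending. This is only a sketch: the URL `https://example.com/` is a placeholder and does not come from the original post.

```python
import requests

# Placeholder target (an assumption); substitute the page you want to fetch.
url = "https://example.com/"

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
                  "AppleWebKit/537.36 (KHTML, like Gecko) "
                  "Chrome/97.0.4692.71 Safari/537.36 Edg/97.0.1072.55"
}

# Without a custom header, requests identifies itself as python-requests/<version>.
default_ua = requests.utils.default_user_agent()

# prepare() builds the request without sending it. A Session would add its
# own default headers on top, but a User-Agent given here takes priority.
prepared = requests.Request("GET", url, headers=headers).prepare()
sent_ua = prepared.headers["User-Agent"]

print("default:", default_ua)
print("spoofed:", sent_ua)
```

Header names are case-insensitive, so `"user-agent"` and `"User-Agent"` behave the same way as dictionary keys here.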
Next, how to send parameters with a request.
Parameters are also wrapped in a dictionary. For a GET request the dictionary is passed as params; for a POST request it is passed as data.
response = requests.get(url=url, params=param, headers=headers)
response = requests.post(url=url, data=data, headers=headers)
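The difference between `params` and `data` is where the dictionary ends up in the request. The sketch below prepares both kinds of request without sending them. The httpbin.org URLs and the sample keyword are placeholders chosen for illustration; they are not from the original post.

```python
import requests

# Placeholder URLs and payload (assumptions for illustration only).
get_url = "https://httpbin.org/get"
post_url = "https://httpbin.org/post"
payload = {"kw": "dog"}

# params= is URL-encoded and appended to the URL as a query string.
get_req = requests.Request("GET", get_url, params=payload).prepare()

# data= is form-encoded and placed in the request body.
post_req = requests.Request("POST", post_url, data=payload).prepare()

print(get_req.url)    # https://httpbin.org/get?kw=dog
print(get_req.body)   # None
print(post_req.url)   # https://httpbin.org/post
print(post_req.body)  # kw=dog
print(post_req.headers["Content-Type"])  # application/x-www-form-urlencoded
```

So a GET carries its parameters in the URL itself, while a POST with `data` leaves the URL unchanged and sends them as a form body.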
Finally, how to get JSON-formatted data.
The response's json() method parses the body and returns a Python object (obj). Call json() only once you have confirmed that the response content is JSON.
How do you know whether the response content is JSON? Open the developer tools again, go to the Network tab, select the request, and look for Content-Type in the response headers; it tells you the type.
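The same check can be made in code by reading the Content-Type response header before calling `json()`. The helper below is a hypothetical name introduced for this sketch, not part of requests. It is only a heuristic, since a server can send JSON under a different content type.

```python
def looks_like_json(content_type):
    """Return True if a Content-Type header value declares a JSON body.

    Hypothetical helper for illustration; not part of the requests library.
    """
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = (content_type or "").split(";")[0].strip().lower()
    return media_type == "application/json" or media_type.endswith("+json")


# With a real response, the call would look like:
#     looks_like_json(response.headers.get("Content-Type"))
for value in ["application/json; charset=utf-8", "text/html", None]:
    print(value, "->", looks_like_json(value))
```

If the check fails, `response.text` still gives the raw body for inspection.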
The following code scrapes Baidu Translate:
# -*- coding: UTF-8 -*-
import json
import requests

if __name__ == "__main__":
    # Endpoint that receives the keyword
    url = 'https://fanyi.baidu.com/sug'
    word = input('enter a word:')
    # Form parameters sent in the POST body
    data = {
        'kw': word
    }
    # UA spoofing: present the crawler as a browser
    headers = {
        "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36 Edg/97.0.1072.55"
    }
    response = requests.post(url=url, data=data, headers=headers)
    # Parse the JSON response body into a Python object
    dic_obj = response.json()
    # Save the result to <word>.json
    filename = word + '.json'
    with open(filename, 'w', encoding='utf-8') as fp:
        json.dump(dic_obj, fp=fp, ensure_ascii=False)
    print('over')
Using json.dump requires importing the json library. dump() serializes a Python object as JSON and writes it to the file object fp.
ensure_ascii: this parameter takes a Boolean value. It defaults to True, in which case non-ASCII characters are written as \uXXXX escape sequences. If you set ensure_ascii to False, Chinese characters are written as-is.
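The effect of ensure_ascii is easiest to see side by side. The sketch below uses `json.dumps`, the string-returning counterpart of `json.dump`, so nothing is written to disk. The sample dictionary is made up for illustration and is not a real API response.

```python
import json

# Sample data chosen for illustration; not a real API response.
record = {"kw": "dog", "meaning": "狗"}

escaped = json.dumps(record)                       # ensure_ascii defaults to True
readable = json.dumps(record, ensure_ascii=False)  # keep non-ASCII characters

print(escaped)   # {"kw": "dog", "meaning": "\u72d7"}
print(readable)  # {"kw": "dog", "meaning": "狗"}

# Both strings decode back to the same Python object.
print(json.loads(escaped) == json.loads(readable))  # True
```

Either form is valid JSON. Setting ensure_ascii to False only makes the saved file readable to a person, which is why the file is opened with encoding='utf-8'.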