Dynamically load data
2022-07-29 07:59:00 【Zhao [email protected]】
| AJAX request | JSON data |
|---|---|
| Data is loaded dynamically, so it does not appear in the page source | JSON is a data transmission format; in essence it is a text representation of an object |
| Enables partial page updates | Objects are used locally within a program, while JSON is used to transmit data |
Encode a Python object as a JSON string:
json.dumps(data)
Decode a JSON string into a Python object:
json.loads(jsonData)
Serialize an object and save the resulting str to a file:
json.dump(obj, fp, ensure_ascii=False)
- obj: the object to serialize.
- fp: a file object that the serialized str is written to. The json module always produces str objects, not bytes objects, so fp.write() must support str input.
- ensure_ascii=False: non-ASCII characters are written as-is instead of being escaped as \uXXXX sequences. The encoding the file was opened with (for example UTF-8) determines the bytes on disk.
Read a JSON string from a file and convert it into a Python object:
json.load(fp)
- fp: a file object whose .read() returns a JSON document, from a text or binary file. Its contents are deserialized into a Python object.
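The four functions above can be compared in a short round trip. This is a minimal sketch: the sample record and its field names are invented for illustration, and io.StringIO stands in for a real file opened in text mode.

```python
import io
import json

# Hypothetical sample record; the field names and values are invented.
record = {"ID": "abc123", "name": "示例企业"}

# dumps: Python object -> JSON string (always str, never bytes).
escaped = json.dumps(record)                        # non-ASCII escaped as \uXXXX
readable = json.dumps(record, ensure_ascii=False)   # non-ASCII kept as-is

# loads: JSON string -> Python object. Both forms decode to the same dict.
assert json.loads(escaped) == record
assert json.loads(readable) == record

# dump/load operate on file objects. StringIO is used here in place of
# open(path, "w", encoding="utf-8") so the example needs no file on disk.
buffer = io.StringIO()
json.dump(record, buffer, ensure_ascii=False)
buffer.seek(0)
restored = json.load(buffer)
```

The ensure_ascii setting only changes how the text looks, not what it decodes to, which is why both strings load back into the same dictionary.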
Example: scraping the drug administration site
Page analysis:
- Is the enterprise data on the listing page dynamically loaded? Yes. A global search in the packet capture tool locates the request that carries the dynamically loaded data:
  POST: http://125.35.6.84:81/xk/itownet/portalAction.do?method=getXkzsList
- The response to this request is a JSON string. A quick look through it shows no URL for the enterprise details page, but it does contain the id of every enterprise.
- The details page URLs of all enterprises share the same domain; only the id request parameter differs. Combining the common domain with each enterprise's id therefore gives a complete details page URL.
- Is the data on the enterprise details page dynamically loaded? Checking with the packet capture tool shows that it is.
- A global search in the packet capture tool again locates the request that carries the dynamically loaded data:
  POST: http://125.35.6.84:81/xk/itownet/portalAction.do?method=getXkzsById
- Request parameter: id=xxxxx. The JSON string returned by this request is the enterprise detail data we ultimately want.
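The two steps above (collect the ids from the list response, then send one request per id) can be sketched without any network access. Only the "list" key and the "ID" field come from the article; the sample values and the helper names extract_ids and build_detail_payload are hypothetical.

```python
# Hypothetical, trimmed-down list response: only the "list" key and the
# "ID" field are taken from the article; the values are invented.
sample_list_response = {
    "list": [
        {"ID": "id-001"},
        {"ID": "id-002"},
    ]
}


def extract_ids(list_response):
    """Collect the enterprise IDs from one page of the list response."""
    return [item["ID"] for item in list_response["list"]]


def build_detail_payload(enterprise_id):
    """Form data for the getXkzsById request: only the id value changes."""
    return {"id": enterprise_id}


ids = extract_ids(sample_list_response)
payloads = [build_detail_payload(i) for i in ids]
print(payloads)
```

In a real run, each payload would be passed as the data argument of a POST request to the getXkzsById URL. Keeping the parsing in small functions makes it possible to check it against a saved response before sending any requests.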
# -*- coding: utf-8 -*-
# @time: 2020/5/17 18:22
# @file: General Administration of Drug Administration.py
# @Software: PyCharm
import json

import requests
from fake_useragent import UserAgent

# Analysis shows that this page is loaded dynamically: viewing the page
# source shows none of the data. It comes from an AJAX request, and the
# POST request URL can be found under XHR in the packet capture tool.
if __name__ == '__main__':
    IDlist = []    # IDs of all enterprises
    infolist = []  # detail records of all enterprises
    ua = UserAgent()
    post_url = "http://125.35.6.84:81/xk/itownet/portalAction.do?method=getXkzsList"
    print(" Start ".center(10, "*"))
    start_page = int(input("Please enter the starting page number: "))
    end_page = int(input("Please enter the end page number: "))
    for page in range(start_page, end_page + 1):
        print("Fetching page %s" % page)
        fromdata = {
            "on": "true",
            "page": page,
            "pageSize": "15",
            "productName": "",
            "conditionType": "1",
            "applyname": "",
            "applysn": "",
        }
        r = requests.post(url=post_url,
                          headers={"User-Agent": ua.chrome},
                          data=fromdata)
        # The response is JSON. Inspecting it in an online JSON parser shows
        # a dictionary whose "list" entry holds the records with the IDs.
        data_json = r.json()
        for data in data_json["list"]:
            IDlist.append(data["ID"])
    # POST URL for the details page
    homepage_post = "http://125.35.6.84:81/xk/itownet/portalAction.do?method=getXkzsById"
    # Traverse the list, sending one request per enterprise ID
    for enterprise_id in IDlist:
        data = {
            "id": enterprise_id,
        }
        r = requests.post(url=homepage_post,
                          headers={"User-Agent": ua.random},
                          data=data)
        info = r.json()
        # print(info)
        infolist.append(info)
    # Persistent storage; the with block closes the file when done
    with open("infodata.json", "w", encoding="utf8") as fp:
        json.dump(infolist, fp=fp, ensure_ascii=False)
    print(" end ".center(10, "*"))