[Crawler] Extracting data with jsonpath
2022-07-04 23:10:00 【Speech unrecognized】
Installation
pip install jsonpath
# or, using the Tsinghua PyPI mirror
pip install jsonpath -i https://pypi.tuna.tsinghua.edu.cn/simple
Usage
from jsonpath import jsonpath
ret = jsonpath(json_dict, 'jsonpath expression string')
jsonpath syntax rules
| JSONPath | Description |
|---|---|
| $ | The root element |
| @ | The current element |
| . or [] | Child element |
| n/a | Parent element; not supported by jsonpath |
| .. | Recursive descent: selects matching elements regardless of their position |
| * | Wildcard: matches all elements |
| n/a | Attribute access; not supported by jsonpath, because JSON is a recursive key:value structure and has no attributes to access |
| [] | Subscript operator; supports simple iteration inside it, such as array indexes or selecting values by content |
| [,] | Union: selects several items in one subscript |
| ?() | Filter expression |
| () | Script expression evaluation |
| n/a | Grouping; not supported by jsonpath |
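The `..` operator is the one crawlers lean on most, so it can help to see what "regardless of position" means in plain Python. The following is only a conceptual sketch using the standard library, not how the jsonpath package is implemented; `find_all` and `sample` are hypothetical names introduced for illustration.

```python
def find_all(node, key):
    """Collect every value stored under `key`, at any depth of nested dicts and lists."""
    found = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                found.append(v)
            # Keep descending: the key may appear again deeper down.
            found.extend(find_all(v, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_all(item, key))
    return found


# Hypothetical sample data, shaped like a small part of a store document.
sample = {
    "store": {
        "book": [{"price": 8.95}, {"price": 12.99}],
        "bicycle": {"price": 19.95},
    }
}

prices = find_all(sample, "price")  # conceptually similar to '$..price'
print(prices)  # [8.95, 12.99, 19.95]
```

The point of the sketch is that the caller never spells out `store`, `book`, or `bicycle`: the search walks the whole structure, which is why `..` is convenient for deeply nested API responses.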
Example
from jsonpath import jsonpath
book_dict = {
    "store": {
        "book": [
            {
                "category": "reference",
                "author": "Nigel Rees",
                "title": "Sayings of the Century",
                "price": 8.95
            },
            {
                "category": "fiction",
                "author": "Evelyn Waugh",
                "title": "Sword of Honour",
                "price": 12.99
            },
            {
                "category": "fiction",
                "author": "Herman Melville",
                "title": "Moby Dick",
                "isbn": "0-553-21311-3",
                "price": 8.99
            },
            {
                "category": "fiction",
                "author": "J. R. R. Tolkien",
                "title": "The Lord of the Rings",
                "isbn": "0-395-19395-8",
                "price": 22.99
            }
        ],
        "bicycle": {
            "color": "red",
            "price": 19.95
        }
    }
}
# Authors of all books in the store
res = jsonpath(book_dict, '$.store.book[*].author')
print(res)
# Indexes start at 0: [index]
res = jsonpath(book_dict, '$.store.book[0]')
print(res)
# Filter by a condition: ?(@.field > number)
# .. jumps straight to the named field, wherever it is nested
res = jsonpath(book_dict, '$..book[?(@.price>10)]')
print(res)
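One practical detail when using these calls in a crawler: the jsonpath package is commonly reported to return `False`, rather than an empty list, when an expression matches nothing. Assuming that behaviour (worth checking against the installed version), a small helper keeps loops over the result safe. `as_list` is a hypothetical name, not part of the package.

```python
def as_list(result):
    """Normalise a jsonpath-style result: a falsy 'no match' value becomes []."""
    # Assumption: a query with no matches yields False instead of a list.
    return list(result) if result else []


# Stand-ins for what a query might return with and without matches.
print(as_list(False))            # []
print(as_list(["Nigel Rees"]))   # ['Nigel Rees']
```

With this in place, `for author in as_list(res): ...` does nothing on a miss instead of raising a `TypeError` from iterating over `False`.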
| JSONPath | Description |
|---|---|
| $.store.book[*].author | The authors of all books in store |
| $..author | All authors |
| $.store.* | All elements directly under store |
| $.store..price | The price of every item in store (the books and the bicycle) |
| $..book[2] | The third book |
| $..book[(@.length-1)] or $..book[-1:] | The last book |
| $..book[0,1] or $..book[:2] | The first two books |
| $..book[?(@.isbn)] | All books that have an isbn |
| $..book[?(@.price>10)] | All books with a price greater than 10 |
| $..* | All data |
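For readers more at home with Python lists than with path expressions, several rows of the table have close plain-Python counterparts. The sketch below uses only the standard library and a trimmed, hypothetical `books` list; it shows roughly what each expression selects and is not a replacement for the jsonpath call.

```python
# Trimmed copy of the book list, keeping only the fields used here.
books = [
    {"title": "Sayings of the Century", "price": 8.95},
    {"title": "Sword of Honour", "price": 12.99},
    {"title": "Moby Dick", "isbn": "0-553-21311-3", "price": 8.99},
    {"title": "The Lord of the Rings", "isbn": "0-395-19395-8", "price": 22.99},
]

third = books[2]                                  # $..book[2]
last = books[-1:]                                 # $..book[-1:]
first_two = books[:2]                             # $..book[:2]
with_isbn = [b for b in books if "isbn" in b]     # $..book[?(@.isbn)]
pricey = [b for b in books if b["price"] > 10]    # $..book[?(@.price>10)]

print(third["title"])                  # Moby Dick
print([b["title"] for b in pricey])    # ['Sword of Honour', 'The Lord of the Rings']
```

The slice forms return a list even when they select a single book, which mirrors how a path query returns a list of matches rather than a bare element.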