[Crawler] Extracting data with jsonpath
2022-07-04 23:10:00 【Speech unrecognized】
Install
pip install jsonpath
# or, using the Tsinghua PyPI mirror:
pip install jsonpath -i https://pypi.tuna.tsinghua.edu.cn/simple
Usage
from jsonpath import jsonpath
ret = jsonpath(json_dict, 'JSONPath expression string')
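One detail matters when using the return value: as far as I know, the `jsonpath` package returns `False` rather than an empty list when nothing matches, so indexing the result directly can fail. A minimal sketch of a guard, assuming that behavior. `first_match` is a hypothetical helper, not part of the library, and it takes an already-computed result so it runs without the package installed:

```python
def first_match(result, default=None):
    """Return the first value from a jsonpath() result, or `default`.

    Assumes jsonpath() gives a non-empty list on success and False
    (or another falsy value) when nothing matches.
    """
    if not result:
        return default
    return result[0]


# Stand-ins for real jsonpath() results
print(first_match(["Nigel Rees", "Evelyn Waugh"]))  # first matched value
print(first_match(False, default="unknown"))        # no match, falls back
```

In real code the first argument would be the value returned by `jsonpath(json_dict, '...')`.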
JSONPath syntax rules
JSONPath | Description |
---|---|
$ | The root element |
@ | The current element |
. or [] | Child element |
n/a | Parent element; not supported by JSONPath |
.. | Recursive descent: selects matching elements at any depth |
* | Wildcard: matches all elements |
n/a | Attribute access; not supported by JSONPath, because JSON is a recursive structure of key:value pairs and has no attributes |
[] | Subscript operator; supports simple iteration, such as array indexes or selecting values by content |
[,] | Union operator; selects several indexes or names at once |
?() | Filter expression |
() | Script expression evaluation |
n/a | Grouping; not supported by JSONPath |
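The `..` operator in the table is the least obvious one, so here is a rough pure-Python sketch of what recursive descent means. It is not how the `jsonpath` package is implemented; `find_all` is a hypothetical helper written only for illustration, and the order of results from the real library may differ:

```python
def find_all(obj, key):
    """Yield every value stored under `key`, at any depth of nested dicts and lists."""
    if isinstance(obj, dict):
        for k, v in obj.items():
            if k == key:
                yield v
            # keep descending, the key may appear again deeper down
            yield from find_all(v, key)
    elif isinstance(obj, list):
        for item in obj:
            yield from find_all(item, key)


data = {
    "store": {
        "book": [{"price": 8.95}, {"price": 12.99}],
        "bicycle": {"price": 19.95},
    }
}

# Roughly what $..price selects
prices = list(find_all(data, "price"))
print(prices)
```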
Example
from jsonpath import jsonpath
book_dict = {
"store": {
"book": [
{
"category": "reference",
"author": "Nigel Rees",
"title": "Sayings of the Century",
"price": 8.95
},
{
"category": "fiction",
"author": "Evelyn Waugh",
"title": "Sword of Honour",
"price": 12.99
},
{
"category": "fiction",
"author": "Herman Melville",
"title": "Moby Dick",
"isbn": "0-553-21311-3",
"price": 8.99
},
{
"category": "fiction",
"author": "J. R. R. Tolkien",
"title": "The Lord of the Rings",
"isbn": "0-395-19395-8",
"price": 22.99
}
],
"bicycle": {
"color": "red",
"price": 19.95
}
}
}
# Authors of all books (no trailing space inside the expression, or the key will not match)
res = jsonpath(book_dict, '$.store.book[*].author')
print(res)
# Indexes start at 0: [index]
res = jsonpath(book_dict, '$.store.book[0]')
print(res)
# Filter by condition: ?(@.field > number)
# .. jumps straight to the named field at any depth
res = jsonpath(book_dict, '$..book[?(@.price>10)]')
print(res)
JSONPath | Description |
---|---|
$.store.book[*].author | The authors of all books in the store |
$..author | All authors |
$.store.* | Everything directly under store |
$.store..price | The price of everything in the store (the books and the bicycle) |
$..book[2] | The third book |
$..book[(@.length-1)] or $..book[-1:] | The last book |
$..book[0,1] or $..book[:2] | The first two books |
$..book[?(@.isbn)] | All books that have an isbn |
$..book[?(@.price>10)] | All books priced above 10 |
$..* | All data |
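For readers who think in Python, the index, slice, and filter rows above map closely onto list operations. The sketch below uses a trimmed copy of the book list and plain Python only. It mirrors the meaning of each expression and does not call the `jsonpath` package; the `"isbn" in b` check is an assumed approximation of `?(@.isbn)`:

```python
books = [
    {"title": "Sayings of the Century", "price": 8.95},
    {"title": "Sword of Honour", "price": 12.99},
    {"title": "Moby Dick", "isbn": "0-553-21311-3", "price": 8.99},
    {"title": "The Lord of the Rings", "isbn": "0-395-19395-8", "price": 22.99},
]

third = books[2]                                  # $..book[2]
last = books[-1:]                                 # $..book[-1:]
first_two = books[:2]                             # $..book[:2]
with_isbn = [b for b in books if "isbn" in b]     # $..book[?(@.isbn)]
over_10 = [b for b in books if b["price"] > 10]   # $..book[?(@.price>10)]

print(third["title"])
print([b["title"] for b in over_10])
```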