Restoring the protobuf protocol behind Bilibili's bullet comments
2022-07-06 23:22:00 【VIP_ CQCRE】
This is the 657th technical post from 「Attacking Coder」
Author: TheWeiJun
Source: The Story of Reverse Engineering and Crawlers
Reading this article takes about 3 minutes.
Contents
I. What is protobuf?
II. Debugging and analyzing the website
III. Restoring the protobuf protocol
IV. Complete code implementation
V. Lessons learned and summary
Fun segment
Xiao Hong is a data analysis engineer. Since she solved the font-based anti-crawling problem last time, she has not run into anything difficult. But the unexpected happened: today, while analyzing bullet comments, she hit a new problem. The data looks garbled and irregular, and is said to be protobuf. Let's analyze Xiao Hong's new problem together!
I. What is the protobuf protocol?
Preface: Protobuf (Protocol Buffers) is a platform-independent, language-independent, extensible, lightweight and efficient serialization format for structured data, developed by Google. It serializes user-defined data structures into byte streams and deserializes byte streams back into data structures. This makes it well suited to data storage and to exchanging data between applications written in different languages: as long as both sides implement the same protocol format, the same .proto file can be compiled into code for each language and added to each project, so that one language can parse data serialized by another. Official support includes C++, Java, Go, and Python, among other languages.
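To make "serializing into a byte stream" concrete, here is a minimal sketch of how two fields could be encoded by hand in the protobuf wire format. The message layout (field 1 as an integer, field 2 as a string) and the helper names are hypothetical and only serve as illustration; real projects should use the code generated by protoc.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def encode_tag(field_number: int, wire_type: int) -> bytes:
    """A field key is (field_number << 3) | wire_type, stored as a varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_int_field(field_number: int, value: int) -> bytes:
    """Wire type 0: the value itself is a varint."""
    return encode_tag(field_number, 0) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Wire type 2: a length prefix followed by the UTF-8 bytes."""
    payload = text.encode("utf-8")
    return encode_tag(field_number, 2) + encode_varint(len(payload)) + payload


# Hypothetical message: field 1 is an integer id, field 2 is a text string.
demo = encode_int_field(1, 150) + encode_string_field(2, "hi")
print(demo.hex())  # → 08960112026869
```

The output carries only field numbers, wire types and values, with no field names. That is why such a response looks irregular in the browser and why the field definitions have to be recovered separately.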
II. Debugging and analyzing the website
1. First, open the website under analysis and search for the text of a specific bullet comment. The screenshot is as follows:
Note: because the bullet-comment content is protobuf-encoded, it cannot be found with a direct text search. We need to analyze the network requests to locate the specific URL.
2. Analyze the network requests and locate the bullet-comment request. The screenshot is as follows:
Note: the screenshot clearly shows that this response holds the bullet-comment content. However, it is protobuf-encoded, so to restore the plaintext we next need to debug the JS with breakpoints.
3. Set an XHR/fetch breakpoint on the request. The screenshot is as follows:
Note: because it is the response that is protobuf-encoded, once we have located where the request is sent we only need to follow the decoding logic that comes after it.
4. After clicking the button to resume execution at the breakpoint, the screenshot is as follows:
Note: at this point the variable r holds the URL of the bullet-comment request. Next, continue stepping through the breakpoints.
5. Keep stepping through the breakpoints to get closer to the decoding logic. The screenshot is as follows:
Now print the value of the variable r. The screenshot is as follows:
Note: isn't this exactly the plaintext we want? Next, we only need to find the field definitions and ids used to initialize the protobuf protocol, and we can restore the plaintext.
6. After further JS breakpoint debugging, we finally locate the protobuf initialization parameters, shown below:
7. Copy the data from the Console and run it through an online JSON formatter. The screenshot is as follows:
Summary: now that we know the plaintext of the response and the fields and ids defined by the protobuf protocol, we only need to build a .proto file to restore the entire plaintext.
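Turning the copied JSON into a .proto file can also be scripted. The sketch below assumes the formatted JSON has a shape like the `descriptor` dictionary shown here, with a type, an id and an optional rule per field. That shape, the `Message` type name and every field number are assumptions for illustration and are not taken from the site. Only the names `Feed`, `message` and `content` appear in the article's own code.

```python
# Hypothetical descriptor, standing in for the JSON copied from the Console.
descriptor = {
    "Feed": {
        "fields": {
            "message": {"rule": "repeated", "type": "Message", "id": 1},
        }
    },
    "Message": {
        "fields": {
            "id": {"type": "int64", "id": 1},
            "content": {"type": "string", "id": 2},
        }
    },
}


def descriptor_to_proto(messages: dict) -> str:
    """Render a {message: {"fields": {...}}} mapping as proto3 source text."""
    lines = ['syntax = "proto3";', ""]
    for message_name, message in messages.items():
        lines.append(f"message {message_name} {{")
        # Sort by field id so the output follows the numbering on the wire.
        fields = sorted(message["fields"].items(), key=lambda item: item[1]["id"])
        for field_name, spec in fields:
            # Only "repeated" (and "optional") are valid labels in proto3;
            # a "required" rule would need manual review.
            rule = spec.get("rule")
            prefix = f"{rule} " if rule else ""
            lines.append(f"  {prefix}{spec['type']} {field_name} = {spec['id']};")
        lines.append("}")
        lines.append("")
    return "\n".join(lines)


proto_text = descriptor_to_proto(descriptor)
print(proto_text)
```

The printed text can be saved as a .proto file and compared by hand against the definitions seen in the debugger before it is compiled.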
III. Restoring the protobuf protocol
1. Restore the protobuf protocol by writing the .proto file with the following structure:
2. Run the following command to compile the .proto file into a Python module:
protoc --python_out=. *.proto
3. After running the command, the generated protobuf module (a _pb2.py file) appears. The screenshot is as follows:
Summary: at this point the protobuf protocol is fully restored. Next comes the complete code implementation.
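Before relying on the generated module, the restored field numbers can be cross-checked against a raw response with a schema-less walk of the wire format. The sketch below is a simplified reader written for this purpose, not part of the protobuf library. The sample bytes are hand-built, with a hypothetical outer field 1 wrapping an integer field 1 and a string field 2, and do not come from the site.

```python
def read_varint(data: bytes, pos: int):
    """Return (value, next_pos) for the varint starting at data[pos]."""
    result = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated varint")
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: this was the last byte
            return result, pos
        shift += 7


def walk_fields(data: bytes):
    """Yield (field_number, wire_type, value) for each top-level field."""
    pos = 0
    while pos < len(data):
        key, pos = read_varint(data, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:    # varint
            value, pos = read_varint(data, pos)
        elif wire_type == 1:  # fixed 64-bit, kept as raw bytes
            value, pos = data[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited: string, bytes or nested message
            length, pos = read_varint(data, pos)
            value, pos = data[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit, kept as raw bytes
            value, pos = data[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        if pos > len(data):
            raise ValueError("truncated field")
        yield field_number, wire_type, value


# Hand-built sample: outer field 1 wraps a nested message (hypothetical layout).
sample = bytes.fromhex("0a0708960112026869")
for number, wire_type, value in walk_fields(sample):
    print(number, wire_type, value.hex())
    for inner in walk_fields(value):  # the nested payload is itself a message
        print("   ", inner)
```

If the field numbers and wire types printed for a real response do not line up with the restored .proto file, the definition most likely needs another look.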
IV. Complete code implementation
1. The complete code of the whole project is as follows:
# -*- coding: utf-8 -*-
# --------------------------------------
# @author : WeChat official account: The Story of Reverse Engineering and Crawlers
# --------------------------------------
import requests
from feed_pb2 import Feed  # module generated by protoc from the restored .proto file
from google.protobuf.json_format import MessageToDict


def start_requests():
    # Cookies and headers copied from the browser request.
    cookies = {
        'rpdid': '|(J~RkYYY|k|0J\'uYulYRlJl)',
        'buvid3': '794669E2-CEBC-4737-AB8F-73CB9D9C0088184988infoc',
        'buvid4': '046D34538-767A-526A-8625-7D1F04E0183673538-022021413-+yHNrXw7i70NUnsrLeJd2Q%3D%3D',
        'DedeUserID': '481849275',
        'DedeUserID__ckMd5': '04771b27fae39420',
        'sid': 'ij1go1j8',
        'i-wanna-go-back': '-1',
        'b_ut': '5',
        'CURRENT_BLACKGAP': '0',
        'buvid_fp_plain': 'undefined',
        'blackside_state': '0',
        'nostalgia_conf': '-1',
        'PVID': '2',
        'b_lsid': '55BA153F_18190A78A34',
        'bsource': 'search_baidu',
        'innersign': '1',
        'CURRENT_FNVAL': '4048',
        'b_timer': '%7B%22ffp%22%3A%7B%22333.1007.fp.risk_794669E2%22%3A%2218190A78B5F%22%2C%22333.788.fp.risk_794669E2%22%3A%2218190A797FF%22%2C%22333.42.fp.risk_794669E2%22%3A%2218190A7A6C5%22%7D%7D',
    }
    headers = {
        'authority': 'xxxxxx',
        'accept': '*/*',
        'accept-language': 'zh-CN,zh;q=0.9',
        'cache-control': 'no-cache',
        'origin': 'https://www.xxxxx.com',
        'pragma': 'no-cache',
        'referer': 'https://www.xxxxxx.li.com/video/BV1434y1L7rb?spm_id_from=333.851.b_7265636f6d6d656e64.1&vd_source=8d45ec9ed78652f966b3625afe95e904',
        'sec-ch-ua': '".Not/A)Brand";v="99", "Google Chrome";v="103", "Chromium";v="103"',
        'sec-ch-ua-mobile': '?0',
        'sec-ch-ua-platform': '"macOS"',
        'sec-fetch-dest': 'empty',
        'sec-fetch-mode': 'cors',
        'sec-fetch-site': 'same-site',
        'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36',
    }
    params = {
        'type': '1',
        'oid': '729126061',
        'pid': '896926231',
        'segment_index': '1',
    }
    response = requests.get('https://xxxx.xxxx.com/x/v2/dm/web/seg.so', params=params, cookies=cookies,
                            headers=headers)
    # Stop early on an HTTP error instead of parsing an error page as protobuf.
    response.raise_for_status()
    # Deserialize the binary response body with the generated message class.
    info = Feed()
    info.ParseFromString(response.content)
    # Convert the message to a plain dict, keeping the field names from the .proto file.
    _data = MessageToDict(info, preserving_proto_field_name=True)
    messages = _data.get("message") or []
    for message in messages:
        print(message.get("content"))


if __name__ == '__main__':
    start_requests()
2. After running the code, the screenshot is as follows:
V. Lessons learned and summary
Looking back over the whole analysis, the difficult points are:
How to quickly locate where the encoded parameters are handled
Understanding and mastering the protobuf protocol
Being able to restore the .proto file from the site's source code
How to use protobuf in Python
End