Parsing the Protobuf format of Bilibili real-time and historical danmaku
2022-07-06 15:59:00 【擒賊先擒王】
References:
- https://zhuanlan.zhihu.com/p/392931611
- https://gitee.com/nbody1996/bilibili-API-collect/blob/master/danmaku/danmaku_proto.md
- Bilibili historical danmaku: https://www.cnblogs.com/mollnn/p/14964905.html
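Bilibili changed the transport format of its danmaku from XML to protobuf. Protobuf is a binary encoding whose transmission efficiency is far higher than that of the old XML, so it reduces network load, which is an advantage on mobile clients. The drawback is that danmaku in this format are much harder to parse: the data returned by the API looks like garbage when viewed directly, and it has to be decoded in a specific way before the real content becomes readable.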
The danmaku API before Bilibili adopted the protobuf protocol
1. What is Protobuf
Protocol buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages.
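The passage above comes from the introduction on Google's Protobuf site. Put simply, protobuf is a serialization format for transmitting structured data that is smaller, faster, and simpler than XML. More information: https://developers.google.com/protocol-buffers/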
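To make the "smaller than XML" claim concrete, the following minimal sketch hand-encodes two fields in the protobuf wire format (one varint field and one length-delimited string) and compares the byte count with an XML rendering of the same data. The record, the field numbers, and the XML layout are hypothetical and chosen only for illustration; real code would use the protobuf library rather than a hand-rolled encoder.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_tag(field_number: int, wire_type: int) -> bytes:
    """A field key is (field_number << 3) | wire_type, itself a varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_int_field(field_number: int, value: int) -> bytes:
    return encode_tag(field_number, 0) + encode_varint(value)  # wire type 0: varint


def encode_string_field(field_number: int, text: str) -> bytes:
    payload = text.encode("utf-8")
    # wire type 2: length-delimited (length prefix, then raw bytes)
    return encode_tag(field_number, 2) + encode_varint(len(payload)) + payload


# Hypothetical record: a time offset in milliseconds plus a short text.
binary = encode_int_field(1, 15000) + encode_string_field(2, "hello")
xml = '<d p="15000">hello</d>'.encode("utf-8")

print(len(binary), "bytes as protobuf,", len(xml), "bytes as XML")
```

The saving comes from replacing textual tag names and decimal digits with one-byte field keys and compact integers; the exact ratio depends on the data.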
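2. How to parse Protobuf danmaku
2.1 Download the protoc compiler
protoc is the compiler that turns .proto files into source code for various programming languages (Python, Golang, and so on). It is required for parsing Protobuf data and can be downloaded from: https://github.com/protocolbuffers/protobuf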


Unzipping the download yields an exe file. It needs no installation, but it must be added to Path manually.

To confirm that the installation succeeded, run the following in a terminal: protoc --version
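The same check can be scripted. The sketch below looks for protoc on the search path and, if it is found, runs it with the --version flag mentioned above. The function name `protoc_version` is a hypothetical helper, not part of any library.

```python
import shutil
import subprocess


def protoc_version():
    """Return the output of `protoc --version`, or None if protoc is not on PATH."""
    path = shutil.which("protoc")
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"],
        capture_output=True,
        text=True,
        check=False,
        timeout=10,
    )
    return result.stdout.strip() or None


version_text = protoc_version()
if version_text is None:
    print("protoc was not found on PATH")
else:
    print("found:", version_text)
```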

2.2 Download Protobuf-Python to parse Protobuf in Python
Download: https://github.com/protocolbuffers/protobuf
After downloading, unzip the archive and change into its python directory.

Then run the following commands:
python setup.py clean
python setup.py build
python setup.py install
python setup.py test
2.3 Define and compile the danmaku proto
Danmaku format as a protobuf message definition:
dm.proto
syntax = "proto3";
package dm;
message DmSegMobileReply {
    repeated DanmakuElem elems = 1;
}
message DanmakuElem {
    int64 id = 1;
    int32 progress = 2;
    int32 mode = 3;
    int32 fontsize = 4;
    uint32 color = 5;
    string midHash = 6;
    string content = 7;
    int64 ctime = 8;
    int32 weight = 9;
    string action = 10;
    int32 pool = 11;
    string idStr = 12;
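}

| Name | Meaning | Type | Notes |
|---|---|---|---|
| id | Danmaku dmID | int64 | Unique; can be used as an operation parameter |
| progress | Time at which the danmaku appears in the video | int32 | Milliseconds |
| mode | Danmaku type | int32 | 1, 2, 3: normal danmaku; 4: bottom danmaku; 5: top danmaku; 6: reverse danmaku; 7: advanced danmaku; 8: code danmaku; 9: BAS danmaku |
| fontsize | Danmaku font size | int32 | 18: small; 25: standard; 36: large |
| color | Danmaku color | uint32 | Decimal RGB888 value |
| midHash | Hash of the sender's UID | string | Used to block a user and to view all danmaku sent by a user; can also be reversed to look up the user ID |
| content | Danmaku content | string | UTF-8 encoded |
| ctime | Time the danmaku was sent | int64 | Timestamp |
| weight | Weight | int32 | Used for the smart-blocking level |
| action | Action | string | Unknown |
| pool | Danmaku pool | int32 | 0: normal pool; 1: subtitle pool; 2: special pool (code/BAS danmaku) |
| idStr | dmID of the danmaku as a string | string | Unique; can be used as an operation parameter |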
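The raw field values in the table can be turned into something readable with a few small helpers. The sketch below is one possible interpretation, assuming exactly the encodings listed in the table: color as a decimal RGB888 value, progress in milliseconds, and the mode numbers shown. The helper names and the sample element are hypothetical.

```python
# Mode numbers as listed in the field table.
MODE_NAMES = {
    1: "normal", 2: "normal", 3: "normal",
    4: "bottom", 5: "top", 6: "reverse",
    7: "advanced", 8: "code", 9: "BAS",
}


def color_to_hex(color: int) -> str:
    """Convert a decimal RGB888 value to a #RRGGBB string."""
    return f"#{color & 0xFFFFFF:06X}"


def progress_to_clock(progress_ms: int) -> str:
    """Convert a millisecond offset into mm:ss.mmm."""
    minutes, rest = divmod(progress_ms, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{minutes:02d}:{seconds:02d}.{millis:03d}"


def describe(elem: dict) -> str:
    """Format one danmaku element (a plain dict keyed by the proto field names)."""
    mode = MODE_NAMES.get(elem.get("mode", 0), "unknown")
    clock = progress_to_clock(elem.get("progress", 0))
    color = color_to_hex(elem.get("color", 0))
    return f"[{clock}] ({mode}, {color}) {elem.get('content', '')}"


# Hypothetical element, not taken from a real response.
sample = {"progress": 83500, "mode": 5, "color": 16777215, "content": "hello"}
print(describe(sample))
```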
2.4 Parse danmaku data in seg.so format
Example video: https://www.bilibili.com/video/av98919207
Before parsing, install the Python protobuf package: pip install protobuf
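A quick way to confirm that the package is available in the current interpreter is to query its installed version. This is a minimal sketch that assumes Python 3.8 or later (for `importlib.metadata`); `protobuf_version` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def protobuf_version():
    """Return the installed version of the protobuf package, or None if it is missing."""
    try:
        return version("protobuf")
    except PackageNotFoundError:
        return None


installed = protobuf_version()
if installed is None:
    print("protobuf is not installed; run: pip install protobuf")
else:
    print("protobuf", installed)
```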

Compile the proto definition file:
protoc --python_out=. dm.proto
When the command finishes it generates dm_pb2.py; import this Python file in your code.

The code of dm_pj.py is as follows.
Note:
- Real-time danmaku need no cookie; requesting the URL directly returns seg.so
- Historical danmaku need a cookie to obtain seg.so
# -*- coding: utf-8 -*-
# @Author :
# @Date :
# @File : dm_pj.py
# @description : XXX
import json

import requests
from google.protobuf.json_format import MessageToJson

from dm_pb2 import DmSegMobileReply

# Cookie of a logged-in session (replace with your own); needed by the history endpoints.
b_web_cookie = 'SESSDATA=fd25e2e6%2C1660373048%2C287c9%2A21;'


def print_contents(raw):
    """Parse a seg.so response body and print the text of every danmaku."""
    dm = DmSegMobileReply()
    dm.ParseFromString(raw)
    data_dict = json.loads(MessageToJson(dm))
    # print(data_dict)
    for elem in data_dict.get('elems', []):
        # MessageToJson omits default values, so an empty content may be absent.
        print(elem.get('content', ''))


def get_date_list():
    """Print the dates in the given month that have historical danmaku."""
    url = "https://api.bilibili.com/x/v2/dm/history/index?type=1&oid=168855206&month=2022-02"
    headers = {
        'cookie': b_web_cookie
    }
    response = requests.get(url, headers=headers)
    print(json.dumps(response.json(), ensure_ascii=False, indent=4))


def dm_real_time():
    """Fetch and print real-time danmaku (no cookie required)."""
    url_real_time = 'https://api.bilibili.com/x/v2/dm/web/seg.so?type=1&oid=168855206&pid=98919207&segment_index=1'
    resp = requests.get(url_real_time)
    print_contents(resp.content)


def dm_history():
    """Fetch and print the historical danmaku of one date (cookie required)."""
    url_history = 'https://api.bilibili.com/x/v2/dm/web/history/seg.so?type=1&oid=168855206&date=2022-02-23'
    headers = {
        'cookie': b_web_cookie
    }
    resp = requests.get(url_history, headers=headers)
    print_contents(resp.content)


if __name__ == '__main__':
    # dm_real_time()
    get_date_list()
    # dm_history()
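To illustrate what ParseFromString has to do with the seg.so bytes, here is a minimal hand-written decoder for the wire format, using only the field numbers from dm.proto. It is a sketch for understanding, not the generated dm_pb2 code and not how the protobuf library is implemented: it does not sign-correct negative integers, it ignores unknown fields, and the sample bytes are built by hand rather than taken from a real response.

```python
ELEM_FIELDS = {
    1: "id", 2: "progress", 3: "mode", 4: "fontsize", 5: "color",
    6: "midHash", 7: "content", 8: "ctime", 9: "weight",
    10: "action", 11: "pool", 12: "idStr",
}
STRING_FIELDS = {6, 7, 10, 12}


def read_varint(buf: bytes, pos: int):
    """Read one base-128 varint starting at pos; return (value, next_pos)."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def iter_fields(buf: bytes):
    """Yield (field_number, wire_type, value) for each field in a message."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:            # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:          # length-delimited: string, bytes, sub-message
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 1:          # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 5:          # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        yield field_number, wire_type, value


def decode_elem(buf: bytes) -> dict:
    """Decode one DanmakuElem into a dict keyed by the proto field names."""
    elem = {}
    for number, wire_type, value in iter_fields(buf):
        name = ELEM_FIELDS.get(number)
        if name is None:
            continue  # skip fields that dm.proto does not define
        if number in STRING_FIELDS and wire_type == 2:
            value = value.decode("utf-8")
        elem[name] = value
    return elem


def decode_reply(buf: bytes) -> list:
    """Decode DmSegMobileReply: field 1 is a repeated DanmakuElem sub-message."""
    return [decode_elem(value)
            for number, wire_type, value in iter_fields(buf)
            if number == 1 and wire_type == 2]


# Hand-built sample: one element with progress=15000 and content="hello".
sample = bytes.fromhex("0a0a109875") + b"\x3a\x05hello"
print(decode_reply(sample))
```

In practice the generated DmSegMobileReply class remains the better choice, since it handles every field type and stays in sync with the .proto file.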
[Screenshot: execution result]

[Screenshot: danmaku comparison]