Parsing the protobuf format of Bilibili's real-time and historical danmaku

2022-07-06 15:59:00 Catch the king before the thief

References:

Bilibili has changed its danmaku (bullet comment) transport format from the original XML to protobuf. Protobuf is a binary encoding, and its transmission efficiency is much higher than that of XML, which helps reduce network load on mobile clients. The drawback is that danmaku in this format are much harder to parse: the data returned by the API looks like gibberish when viewed directly, and you need a specific method to see the real content, which is a headache.

Bilibili's danmaku interface before the protobuf protocol was adopted

1. What is Protobuf?

Protocol buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages.

The passage above comes from the introduction on Google's official Protobuf site. In short, protobuf is a data serialization format that is smaller, faster, and simpler than XML. More information can be found at: https://developers.google.com/protocol-buffers/
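Because protobuf is a binary encoding, its raw bytes are not human-readable. The following is a minimal sketch of the wire format that hand-decodes varint and length-delimited fields. The function names `read_varint` and `decode_fields` and the sample bytes are illustrative assumptions, not part of any library, and only wire types 0 and 2 are handled.

```python
def read_varint(buf, pos=0):
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: this was the last byte
            return result, pos
        shift += 7


def decode_fields(buf):
    """Split a message into (field_number, wire_type, value) triples.

    Hypothetical helper: supports only wire type 0 (varint) and
    wire type 2 (length-delimited, e.g. strings and nested messages).
    """
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((field_number, wire_type, value))
    return fields


# Hand-built sample: field 2 (varint) = 300, field 7 (bytes) = b"hi"
sample = bytes([0x10, 0xAC, 0x02, 0x3A, 0x02]) + b"hi"
print(decode_fields(sample))  # [(2, 0, 300), (7, 2, b'hi')]
```

In practice the generated protobuf classes do this decoding for you; the sketch only illustrates why the payload looks like noise in a text viewer.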

2. How to parse protobuf danmaku

2.1 Download the protoc compiler

protoc is the compiler that turns .proto files into source code for various programming languages (such as Python and Golang). It is required for parsing protobuf and can be downloaded from: https://github.com/protocolbuffers/protobuf

After downloading, unzip the archive to get the executable (an exe file on Windows). No installation is required, but you must add it to your PATH manually.

To check whether the installation succeeded, run the following command in a terminal: protoc --version
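The same check can be scripted. This is a minimal sketch using only the Python standard library; the helper name `protoc_version` is hypothetical.

```python
import shutil
import subprocess


def protoc_version(executable="protoc"):
    """Return the output of `<executable> --version`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    completed = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    return completed.stdout.strip() or None


version = protoc_version()
print(version if version else "protoc not found on PATH")
```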

2.2 Download Protobuf-Python to parse protobuf in Python

Download address: https://github.com/protocolbuffers/protobuf

Unzip the archive after downloading, enter the python directory, and run the following commands:

python setup.py clean
python setup.py build
python setup.py install
python setup.py test

2.3 Define and compile the danmaku proto

The danmaku format as a protobuf structure:

dm.proto

syntax = "proto3";

package dm;

message DmSegMobileReply{
  repeated DanmakuElem elems = 1;
}
message DanmakuElem{
  int64 id = 1;
  int32 progress = 2;
  int32 mode = 3;
  int32 fontsize = 4;
  uint32 color = 5;
  string midHash = 6;
  string content = 7;
  int64 ctime = 8;
  int32 weight = 9;
  string action = 10;
  int32 pool = 11;
  string idStr = 12;
}

| Name | Meaning | Type | Remarks |
|---|---|---|---|
| id | Danmaku dmID | int64 | Unique; can be used as a parameter for operations |
| progress | Time at which the danmaku appears in the video | int32 | Milliseconds |
| mode | Danmaku type | int32 | 1, 2, 3: normal danmaku; 4: bottom danmaku; 5: top danmaku; 6: reverse danmaku; 7: advanced danmaku; 8: code danmaku; 9: BAS danmaku |
| fontsize | Danmaku font size | int32 | 18: small; 25: standard; 36: large |
| color | Danmaku color | uint32 | RGB888 value in decimal |
| midHash | Hash of the sender's UID | string | Used to block a user and to view all danmaku sent by that user; can also be used to look up the user ID in reverse |
| content | Danmaku content | string | UTF-8 encoded |
| ctime | Time the danmaku was sent | int64 | Timestamp |
| weight | Weight | int32 | Used for the smart-blocking level |
| action | Action | string | Unknown |
| pool | Danmaku pool | int32 | 0: normal pool; 1: subtitle pool; 2: special pool (code/BAS danmaku) |
| idStr | String form of the danmaku dmID | string | Unique; can be used as a parameter for operations |
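The field values listed above are raw numbers, so some conversion helps when printing them. The sketch below shows one possible way to do it. All helper names are hypothetical, and it assumes each element is a dict obtained from `json.loads(MessageToJson(...))`, where fields that hold their default value may be absent.

```python
# Mode codes taken from the field list above
MODE_NAMES = {
    1: "normal", 2: "normal", 3: "normal",
    4: "bottom", 5: "top", 6: "reverse",
    7: "advanced", 8: "code", 9: "BAS",
}


def color_to_hex(color):
    """Convert a decimal RGB888 value to a #RRGGBB string."""
    return f"#{color & 0xFFFFFF:06X}"


def progress_to_timestamp(progress_ms):
    """Convert a progress value in milliseconds to MM:SS."""
    minutes, seconds = divmod(progress_ms // 1000, 60)
    return f"{minutes:02d}:{seconds:02d}"


def describe(elem):
    """Format one danmaku dict as a single readable line."""
    timestamp = progress_to_timestamp(int(elem.get("progress", 0)))
    mode = MODE_NAMES.get(int(elem.get("mode", 0)), "unknown")
    color = color_to_hex(int(elem.get("color", 0)))
    return f"[{timestamp}] {mode} {color} {elem.get('content', '')}"


# Made-up sample element, not real API output
print(describe({"progress": 83500, "mode": 5, "color": 16777215, "content": "hello"}))
# [01:23] top #FFFFFF hello
```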

2.4 Parse danmaku data in seg.so format

Sample video: https://www.bilibili.com/video/av98919207

Before parsing, install the Python protobuf package: pip install protobuf

Compile the proto structure file:

protoc --python_out=. dm.proto

This generates dm_pb2.py. Import this Python file in your code.

Note:

  • Real-time danmaku: no cookie is needed; request seg.so directly.
  • Historical danmaku: a cookie is required to fetch seg.so.

The code of dm_pj.py is as follows:

# -*- coding: utf-8 -*-
# @Author  : 
# @Date    :
# @File    : dm_pj.py
# @description : XXX


import json
import requests
from dm_pb2 import DmSegMobileReply
from google.protobuf.json_format import MessageToJson


# SESSDATA cookie of a logged-in account; only the history endpoints need it
b_web_cookie = 'SESSDATA=fd25e2e6%2C1660373048%2C287c9%2A21;'


def get_date_list():
    """Print the dates in the given month that have historical danmaku (cookie required)."""
    url = "https://api.bilibili.com/x/v2/dm/history/index?type=1&oid=168855206&month=2022-02"
    headers = {
        'cookie': b_web_cookie
    }
    response = requests.get(url, headers=headers)
    print(json.dumps(response.json(), ensure_ascii=False, indent=4))


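def dm_real_time():
    """Fetch one real-time danmaku segment (no cookie needed) and print each comment."""
    url_real_time = 'https://api.bilibili.com/x/v2/dm/web/seg.so?type=1&oid=168855206&pid=98919207&segment_index=1'
    resp = requests.get(url_real_time)

    # Decode the binary protobuf payload, then convert it to a plain dict
    DM = DmSegMobileReply()
    DM.ParseFromString(resp.content)
    data_dict = json.loads(MessageToJson(DM))
    # print(data_dict)
    # MessageToJson omits fields that hold default values, so use .get()
    for elem in data_dict.get('elems', []):
        print(elem.get('content', ''))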


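def dm_history():
    """Fetch the historical danmaku for one date (cookie required) and print each comment."""
    url_history = 'https://api.bilibili.com/x/v2/dm/web/history/seg.so?type=1&oid=168855206&date=2022-02-23'
    headers = {
        'cookie': b_web_cookie
    }
    resp = requests.get(url_history, headers=headers)
    # Decode the binary protobuf payload, then convert it to a plain dict
    DM = DmSegMobileReply()
    DM.ParseFromString(resp.content)
    data_dict = json.loads(MessageToJson(DM))
    # print(data_dict)
    # MessageToJson omits fields that hold default values, so use .get()
    for elem in data_dict.get('elems', []):
        print(elem.get('content', ''))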


if __name__ == '__main__':
    # Uncomment the call you want to run
    # dm_real_time()
    get_date_list()
    # dm_history()
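The script requests only segment_index=1. A longer video presumably needs several requests, one per segment. The sketch below builds the URL list under the assumption that each seg.so segment covers six minutes of video; that segment length and the helper name `segment_urls` are assumptions, not something stated in this article.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.bilibili.com/x/v2/dm/web/seg.so"
SEGMENT_SECONDS = 6 * 60  # assumption: one segment covers six minutes


def segment_urls(oid, pid, duration_seconds):
    """Return one seg.so URL per segment needed to cover the whole video."""
    # Ceiling division, with at least one segment for very short videos
    count = max(1, -(-duration_seconds // SEGMENT_SECONDS))
    urls = []
    for index in range(1, count + 1):
        query = urlencode(
            {"type": 1, "oid": oid, "pid": pid, "segment_index": index}
        )
        urls.append(f"{BASE_URL}?{query}")
    return urls


# The 800-second duration is a made-up value for illustration
for url in segment_urls(168855206, 98919207, 800):
    print(url)
```

Each URL can then be fetched and parsed the same way as in dm_real_time().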

Screenshot of the execution result:

Danmaku comparison:

Copyright notice
This article was written by [Catch the king before the thief]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/187/202207060919582909.html