Experiment 7 H.264 file analysis
2022-07-23 07:58:00 【Myster_ KID】
H.264 File analysis and codec implementation
H.264 is a digital video compression format jointly proposed in December 2002 by the International Organization for Standardization (ISO) and the International Telecommunication Union (ITU) as the successor to MPEG-4. It offers higher coding efficiency, is designed with mobile and IP network adaptation in mind, takes channel characteristics into account, and can limit the propagation of bit errors.
I. Introduction to H.264
1. Characteristics of H.264
- High compression ratio: at the same image quality, a bitstream compressed with H.264 needs only about 1/2 the data of MPEG-2;
- High fault tolerance: the H.264 bitstream is strongly error-resilient and can cope with channels that have high packet loss and severe interference, such as IP and wireless networks;
- Good network adaptability: H.264 provides a network abstraction layer, so H.264 files can be transmitted easily over different networks;
- High computational complexity: H.264 trades high computational complexity for better performance; its complexity is roughly 2 to 3 times that of MPEG-2.
2. H.264 codec
The structure of the H.264 encoder differs little from that of the previous-generation MPEG-2 standard. It is mainly a hybrid of transform coding and DPCM with motion compensation. The codec structure is shown in the figure:

3. Techniques adopted by H.264
(1) Layered design
H.264 conceptually separates the video coding layer (VCL) from the network abstraction layer (NAL). This allows efficient transmission in different transport environments and makes it easier to connect seamlessly with current and future coding formats and different types of networks.
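To make the NAL side of this split concrete, the sketch below splits a raw byte stream into NAL units and reads each unit's type from the low 5 bits of its first byte. It assumes the Annex B byte-stream format (units separated by `00 00 01` start codes); `split_nal_units` and the sample bytes are illustrative, not taken from any tool used in this experiment.

```python
def split_nal_units(stream: bytes):
    """Yield (nal_unit_type, unit_bytes) for each NAL unit in an Annex B byte stream."""
    # Locate every 3-byte start code 00 00 01. A 4-byte start code is the
    # same pattern preceded by one extra zero byte.
    starts = []
    i = stream.find(b"\x00\x00\x01")
    while i != -1:
        starts.append(i + 3)
        i = stream.find(b"\x00\x00\x01", i + 3)

    for n, begin in enumerate(starts):
        end = starts[n + 1] - 3 if n + 1 < len(starts) else len(stream)
        # A NAL unit never ends in 0x00, so trailing zeros belong to the
        # next (4-byte) start code and can be dropped.
        unit = stream[begin:end].rstrip(b"\x00")
        if unit:
            yield unit[0] & 0x1F, unit  # low 5 bits of the header byte = nal_unit_type


# Hypothetical sample: an SPS (type 7), a PPS (type 8) and an IDR slice (type 5).
sample = (b"\x00\x00\x00\x01\x67\x64\x00\x1f"
          b"\x00\x00\x01\x68\xee"
          b"\x00\x00\x00\x01\x65\x88")
units = list(split_nal_units(sample))
print([t for t, _ in units])          # NAL unit types in stream order
print(units[0][1][1], units[0][1][3])  # profile_idc and level_idc bytes of the SPS
```

In an SPS the byte right after the NAL header is `profile_idc` and the third byte after it is `level_idc`, which is why the sample's `0x64` and `0x1f` read as 100 and 31.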
(2) Intra prediction coding
The value of the current macroblock is predicted from the values of neighbouring blocks; the difference between the prediction and the original is then transformed, quantized, and coded. Luma blocks use 4×4 and 16×16 prediction modes; chroma blocks use an 8×8 mode.
Taking the 4×4 luma intra prediction as an example, the 16 pixels of a 4×4 block are predicted from the 13 surrounding pixels. There are 9 prediction modes in total; the SAE (sum of absolute errors) of each is computed, and the mode with the smallest SAE is chosen as the prediction mode for the block.
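The SAE-based mode decision can be sketched as follows. This is a simplified illustration that implements only three of the nine 4×4 modes (vertical, horizontal, DC) and uses only the four pixels above and the four to the left; the function names and sample values are assumptions made for the example.

```python
def predict_4x4(mode, top, left):
    """Build the 4x4 prediction for one of three simplified intra modes."""
    if mode == "vertical":      # copy the row above into every row
        return [list(top) for _ in range(4)]
    if mode == "horizontal":    # copy the left column into every column
        return [[left[r]] * 4 for r in range(4)]
    if mode == "dc":            # rounded mean of the 8 neighbouring pixels
        dc = (sum(top) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError(f"unsupported mode: {mode}")


def sae(block, pred):
    """Sum of absolute errors between the original block and a prediction."""
    return sum(abs(block[r][c] - pred[r][c]) for r in range(4) for c in range(4))


def best_mode(block, top, left):
    """Return (mode with the smallest SAE, SAE of every candidate mode)."""
    costs = {m: sae(block, predict_4x4(m, top, left))
             for m in ("vertical", "horizontal", "dc")}
    return min(costs, key=costs.get), costs


# Hypothetical block whose rows repeat the pixels above it.
top = [10, 20, 30, 40]
left = [12, 12, 12, 12]
block = [[10, 20, 30, 40] for _ in range(4)]
mode, costs = best_mode(block, top, left)
print(mode, costs)
```

Because every row of the sample block equals the row above it, vertical prediction reproduces the block exactly and wins with an SAE of 0.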
(3) Inter prediction coding: block-based motion compensation
- Tree-structured motion compensation: blocks of different sizes and shapes are used for motion compensation (large blocks in low-frequency regions, small blocks in high-frequency regions);
- Motion vectors with 1/4-pixel precision;
- Motion search across multiple reference frames, choosing the frame most similar to the frame being encoded as the reference;
- SI and SP frames: these meet the requirements of bitstream bandwidth adaptation and error resilience, improve network friendliness, and support streaming services.
(4) Integer transform
An integer DCT replaces the DCT used in MPEG-2, so the transform itself loses no precision. The standard specifies both the forward and the inverse transform exactly, so encoder and decoder cannot mismatch. In addition, the integer transform needs only shifts and additions, which keeps it computationally efficient.
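A minimal sketch of the 4×4 forward core transform shows why only additions and shifts are needed. It assumes the commonly cited core matrix with rows (1, 1, 1, 1), (2, 1, -1, -2), (1, -1, -1, 1), (1, -2, 2, -1); the scaling that normally follows is folded into quantization and is left out here. The function names are illustrative.

```python
def fwd1d(x0, x1, x2, x3):
    """One 4-point butterfly of the core transform using only adds and shifts."""
    s03, d03 = x0 + x3, x0 - x3
    s12, d12 = x1 + x2, x1 - x2
    return (s03 + s12,           #  x0 +  x1 +  x2 +  x3
            (d03 << 1) + d12,    # 2x0 +  x1 -  x2 - 2x3
            s03 - s12,           #  x0 -  x1 -  x2 +  x3
            d03 - (d12 << 1))    #  x0 - 2x1 + 2x2 -  x3


def forward_4x4(block):
    """Apply the 1-D butterfly to every row, then to every column."""
    rows = [fwd1d(*r) for r in block]
    cols = [fwd1d(*c) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to row-major order


# A flat block has all of its energy in the DC coefficient.
flat = [[5] * 4 for _ in range(4)]
print(forward_4x4(flat))
```

For the flat block the only non-zero output is the top-left coefficient, 16 × 5 = 80, and every intermediate value stays an integer.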
(5) Quantization
The H.264 standard supports 52 quantization step sizes. Each time the quantization parameter QP increases by 6, the quantization step Qstep doubles. The wide range of step sizes gives enough flexibility and precision to balance bit rate against coding quality.
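The doubling rule can be written out in a few lines. The sketch assumes the commonly tabulated step sizes for QP 0 to 5 (0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125); every later value follows from doubling once per 6 QP steps. The name `qstep` is illustrative.

```python
# Assumed base step sizes for QP = 0..5; all later values are these doubled
# once for every full increase of 6 in QP.
QSTEP_BASE = (0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125)


def qstep(qp: int) -> float:
    """Quantization step size for a QP in the range 0..51."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in 0..51")
    return QSTEP_BASE[qp % 6] * (1 << (qp // 6))


for qp in (0, 6, 12, 24, 51):
    print(qp, qstep(qp))
```

Under these assumptions QP 0, 6, and 12 give 0.625, 1.25, and 2.5, and the largest step, at QP 51, is 224.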
(6) Deblocking filter
To remove blocking artifacts, H.264 adds an adaptive deblocking filter inside the prediction loop of the coding system.
(7) Entropy coding
The H.264 standard defines two entropy coding methods: CAVLC (Context-based Adaptive Variable Length Coding) and CABAC (Context-based Adaptive Binary Arithmetic Coding). The former is the baseline method and the latter is optional. CABAC achieves better coding performance than CAVLC, but at higher computational complexity.
II. H.264 file analysis
Sequence parameter set and picture parameter set
Sequence parameter set SPS (Sequence Parameter Set)
| Field | Description |
|---|---|
| profile_idc | The profile that the bitstream conforms to. |
| level_idc | The level of the bitstream. A level defines limits such as the maximum video resolution and maximum frame rate under given conditions; the level the bitstream conforms to is specified by level_idc. |
| seq_parameter_set_id | The id of this sequence parameter set. A picture parameter set (PPS) uses this id to reference the parameters in the SPS. |
| pic_order_cnt_type | The method used to decode the picture order count (POC). |
| log2_max_pic_order_cnt_lsb_minus4 | Used to compute MaxPicOrderCntLsb, the upper limit of the POC: MaxPicOrderCntLsb = 2^(log2_max_pic_order_cnt_lsb_minus4 + 4) |
| num_ref_frames | The maximum number of short-term and long-term reference frames, complementary reference field pairs, and unpaired reference fields that may be used for inter prediction when decoding any picture in the video sequence. |
| pic_width_in_mbs_minus1 | Plus 1, the width of each decoded picture in macroblocks. |
| pic_height_in_map_units_minus1 | Plus 1, the height of each decoded picture in map units (equal to macroblocks when frame_mbs_only_flag is 1). |
| frame_mbs_only_flag | Flag describing how macroblocks are coded. When 0, macroblocks may be frame coded or field coded; when 1, all macroblocks are frame coded. |
| direct_8x8_inference_flag | Flag used when deriving motion vectors in B_Skip and B_Direct modes. |
| gaps_in_frame_num_value_allowed_flag | Flag indicating whether gaps (non-consecutive values) are allowed in frame_num. |
| frame_cropping_flag | Flag indicating whether the output picture is cropped. |
| vui_parameters_present_flag | Flag indicating whether the SPS contains VUI information. |
Picture parameter set PPS (Picture Parameter Set)
| Field | Description |
|---|---|
| pic_parameter_set_id | The id of this PPS. A PPS is referenced by slices in the bitstream; a slice references a PPS by storing the PPS id in its slice header. The value range is [0,255]. |
| seq_parameter_set_id | The id of the active SPS referenced by this PPS. Through it, the PPS can also obtain the parameters of the corresponding SPS. The value range is [0,31]. |
| entropy_coding_mode_flag | Entropy coding mode flag; it indicates which entropy coding/decoding algorithm is used in the bitstream. For some syntax elements, different entropy coding methods are chosen under different coding configurations. |
| num_slice_groups_minus1 | Plus 1, the number of slice groups in a frame. When the value is 0, all slices in a frame belong to one slice group. A slice group is a set of macroblocks in a frame, defined in clause 3.141 of the standard. |
| weighted_pred_flag | Flag indicating whether weighted prediction is enabled in P/SP slices. |
| weighted_bipred_idc | The weighted prediction method used in B slices; the value range is [0,2]. 0 means default weighted prediction, 1 means explicit weighted prediction, and 2 means implicit weighted prediction. |
| pic_init_qp_minus26 and pic_init_qs_minus26 | The initial quantization parameters. The actual quantization parameters are computed from these values and slice_qp_delta/slice_qs_delta in the slice header. |
| chroma_qp_index_offset | Used to compute the quantization parameter of the chroma components; the value range is [-12,12]. |
| deblocking_filter_control_present_flag | Flag indicating whether the slice header carries deblocking filter control information. When 1, the slice header contains the deblocking filter information; when 0, it does not. |
Analyzing an .mp4 file with a bitstream analyzer
SPS information for the first frame of movie.mp4:
- profile_idc: 100, which identifies High profile
- level_idc: 31, corresponding to level 3.1; the maximum macroblock processing rate is 108000 macroblocks per second, and the maximum frame size is 3600 macroblocks
- In High profile at this level, the maximum bit rate is 17.5 Mbps
- seq_parameter_set_id: 0, the id of the current sequence parameter set is 0
- log2_max_pic_order_cnt_lsb_minus4: 2, from which the POC upper limit is 2^(2+4) = 64
- pic_order_cnt_type: 0
- num_ref_frames: 16, the maximum number of reference frames is 16
- pic_width_in_mbs_minus1: 39, the picture is 40 macroblocks wide
- pic_height_in_map_units_minus1: 22, the picture is 23 macroblocks high
- gaps_in_frame_num_value_allowed_flag: 0, gaps are not allowed in frame_num
- frame_mbs_only_flag: 1, all macroblocks are frame coded
- frame_cropping_flag: 0, the output picture is not cropped
- vui_parameters_present_flag: 1, the SPS contains VUI information
PPS information:
- pic_parameter_set_id: 0, the id of the current PPS is 0
- seq_parameter_set_id: 0, the id of the active SPS referenced by this PPS is 0, i.e. the SPS above
- entropy_coding_mode_flag: 1, the entropy coding mode flag is 1 (CABAC)
- num_slice_groups_minus1: 0, all slices in a frame belong to one slice group
- weighted_pred_flag: 1, weighted prediction is enabled
- weighted_bipred_idc: 2, B slices use implicit weighted prediction
- pic_init_qp_minus26: 0, initial quantization parameter
- pic_init_qs_minus26: 0, initial quantization parameter
- chroma_qp_index_offset: -2, used to compute the quantization parameter of the chroma components
- deblocking_filter_control_present_flag: 1, the slice header contains deblocking filter information
- constrained_intra_pred_flag: 0, intra (I) macroblocks may use information from inter-coded macroblocks for prediction
- redundant_pic_cnt_present_flag: 0, the slice header does not contain the redundant_pic_cnt syntax element
In summary, the picture is 40 macroblocks wide and 23 macroblocks high, so the frame width is 40×16 = 640 and the frame height is 23×16 = 368, giving a resolution of 640×368. The reported rate is 96 kbps (a bit rate rather than a frame rate, given the unit), and all slices belong to one slice group.
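The arithmetic behind these derived values can be reproduced from the SPS fields listed earlier. The sketch below is illustrative: `derive_from_sps` is a hypothetical helper, cropping is ignored because frame_cropping_flag is 0 here, and the level 3.1 frame-size limit of 3600 macroblocks is the figure quoted in the list above.

```python
def derive_from_sps(sps: dict) -> dict:
    """Derive picture size and POC limit from a few decoded SPS fields."""
    width_mbs = sps["pic_width_in_mbs_minus1"] + 1
    height_map_units = sps["pic_height_in_map_units_minus1"] + 1
    # With frame_mbs_only_flag = 0 a map unit covers two macroblock rows.
    height_mbs = (2 - sps["frame_mbs_only_flag"]) * height_map_units
    return {
        "width": width_mbs * 16,     # a macroblock is 16x16 luma samples
        "height": height_mbs * 16,
        "mbs_per_frame": width_mbs * height_mbs,
        "max_pic_order_cnt_lsb": 1 << (sps["log2_max_pic_order_cnt_lsb_minus4"] + 4),
    }


# Field values read from movie.mp4 in the list above.
sps = {
    "pic_width_in_mbs_minus1": 39,
    "pic_height_in_map_units_minus1": 22,
    "frame_mbs_only_flag": 1,
    "log2_max_pic_order_cnt_lsb_minus4": 2,
}
info = derive_from_sps(sps)
print(info)
print("fits level 3.1 frame size:", info["mbs_per_frame"] <= 3600)
```

With these inputs the sketch yields 640×368, 920 macroblocks per frame (well under the 3600 limit), and a POC limit of 64.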
Analyzing the GOP structure
A GOP (Group of Pictures) is a group of consecutive pictures in a picture sequence. It is the basic unit for editing, accessing, and compression-coding the encoded video stream, and it contains frames of different coding types.
Increasing the GOP length, or raising the proportion of P/B frames in the GOP, improves the compression ratio and lowers the bit rate. In general, therefore, at a fixed bit rate a larger GOP gives better image quality (because the proportion of P/B frames is larger), and at a fixed image quality a larger GOP gives a lower bit rate.
III. Decoding
In the ldecod project, open decoder_test.c. The input and output file names can be changed there, for example:
#define BITSTREAM_FILENAME "highway_qcif.264"
#define DECRECON_FILENAME "highway_qcif_dec.yuv"
The .264 file can then be decoded into a .yuv file:
IV. Encoding
1. Encoding parameters
| Line | Parameter | Meaning | Value |
|---|---|---|---|
| 13/56/57 | InputFile/ OutputFile/ ReconFile | Input file / output file (.264) / reconstructed file (.yuv) | |
| 30/31/ 33/34 | SourceWidth/SourceHeight/ OutputWidth/OutputHeight | Width and height of the input and output video sequences | |
| 16 | FramesToBeEncoded | Number of frames to encode | |
| 72 | IntraPeriod | Period of I-frames within the GOP | 0 means only the first frame is an I-frame |
| 73 | IDRPeriod | Period of IDR frames, i.e. the GOP length | 0 means only the first frame is an IDR frame |
| 77 | EnableIDRGOP | Whether IDR GOPs are enabled | 0 disables, 1 enables |
| 78 | EnableOpenGOP | Whether open GOPs are enabled | 0 disables, 1 enables |
| 180 | NumberBFrames | Number of B-frames between two I/P frames | |
| 347 | PrimaryGOPLength | GOP length for redundant allocation (1-16) | |
| 444 | RateControlEnable | Whether rate control is enabled | 0 disables, 1 enables |
| 445 | Bitrate | Bit rate (unit: bps) | |
| 453 | RCUpdateMode | Mode 1 suits all-I coding; modes 2 and 3 suit other cases. This experiment uses 2 | 0: the original JM rate control; 1: rate control algorithm applied to all frames; 2: mode 1 plus consideration of the quantization parameters of I/P slices; 3: hybrid quadratic rate control, which takes the real-time bit rate into account during control |
2. Encoder debugging
In all of the following experiments we set IntraPeriod = 0 and PrimaryGOPLength = IDRPeriod, so that only the variables under study change.
We choose bit rates of 1600, 1650, 1700, 1750, and 1800 kbps, and at each bit rate encode with the GOP structures GOP15(2B), GOP12(2B), GOP9(2B), GOP4(1B), GOP12 (no B), and GOP1 (all-I).
Run the lencod project to encode miss.yuv. The results are as follows:
Taking GOP12(0B) at 1800 kbps as an example, the table shows that the GOP size is 12, and frame reordering can be seen.
V. Result analysis
1. Bitstream analysis
The encoded output is analyzed with the Elecard StreamEye Tools bitstream analysis software.
Taking GOP15(2B) as an example:


Each rectangle in the figure represents one frame (red, blue, and green stand for I, P, and B frames). The structure of the encoded bitstream is clearly visible, and so is the frame reordering (the display order is IBBPBBPBBPBBPBP).
We take 4 of these frames (I, B, B, P) for closer analysis.
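Frame reordering arises because a B-frame cannot be decoded until the reference frame that follows it in display order has been decoded. The sketch below illustrates this with a simplified model in which B-frames are never used as references; `display_to_coding_order` is a hypothetical helper, and real encoders may order frames differently.

```python
def display_to_coding_order(display: str):
    """Map a display-order frame-type string to (display_index, type) in coding order."""
    coding, pending_b = [], []
    for idx, ftype in enumerate(display):
        if ftype == "B":
            # Hold the B-frame until its backward reference has been coded.
            pending_b.append((idx, ftype))
        else:
            # An I- or P-frame is coded first, then the B-frames that precede it.
            coding.append((idx, ftype))
            coding.extend(pending_b)
            pending_b.clear()
    # Trailing B-frames would wait for the next GOP; here they are simply appended.
    coding.extend(pending_b)
    return coding


order = display_to_coding_order("IBBPBBPBBPBBPBP")
print("".join(t for _, t in order))   # frame types in coding order
print([i for i, _ in order])          # display indices in coding order
```

For the display order shown in the figure, the coding order under this model becomes IPBBPBBPBBPBBPB: each P-frame moves ahead of the B-frames that depend on it.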



In the preview we turn on the display of macroblock boundaries, macroblock types, and motion vectors.
In the pictures, red and orange macroblocks use intra coding (the macroblock sizes differ slightly); yellow macroblocks are of type Inter (B_Skip), meaning they are the same as in the previous frame and are skipped without coding; green macroblocks use forward prediction, and blue macroblocks use bidirectional prediction.
From the pictures we can see that:
- The background of the 4 frames changes very little, so the background is not coded;
- Motion vectors are visible in the B and P frames. Red lines indicate forward prediction and green lines indicate backward prediction, so P frames contain only red lines; I frames use intra coding and have no motion vectors;
- Macroblock partitions of different shapes and sizes are visible, and the partitions in low-frequency regions are larger. This is one of the advanced H.264 techniques mentioned earlier.
2. Objective evaluation of video quality
Next we objectively evaluate the video quality of the H.264 files encoded above, covering 6 GOP structures at 5 bit rates each. The metric is the PSNR (dB) of the Y component. Because the actual bit rate of an encoded H.264 file deviates somewhat from the target bit rate, the rate-distortion curves are plotted against the actual bit rate.
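For reference, the Y-component PSNR can be computed as in the sketch below, using the usual definition PSNR = 10·log10(peak² / MSE) with a peak of 255 for 8-bit samples. The function name and sample values are illustrative; in practice the inputs would be the luma planes of the original and reconstructed .yuv frames.

```python
import math


def psnr_y(ref, rec, peak=255):
    """PSNR in dB between two equal-length sequences of luma samples."""
    if len(ref) != len(rec) or len(ref) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(ref, rec)) / len(ref)
    if mse == 0:
        return math.inf  # identical frames: no distortion
    return 10 * math.log10(peak ** 2 / mse)


# Hypothetical 4-sample "frames": two samples are off by one.
original = [100, 100, 100, 100]
decoded = [101, 99, 100, 100]
print(round(psnr_y(original, decoded), 2))
```

Averaging this value over all frames of a sequence gives one point of a rate-distortion curve; pairing it with the measured bit rate of the file gives the horizontal coordinate.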
As the figure shows, at the same bit rate the video quality from best to worst is: GOP12 (0B) > GOP15 (2B) > GOP12 (2B) > GOP9 (2B) > GOP4 (1B) > GOP1 (all-I).
The following table lists the proportions of I, P, and B frames in each of the 6 GOP structures. For example, comparing GOP15(2B) with GOP12(2B): the two have the same proportion of P frames, while the former has a larger proportion of B frames and therefore better image quality. The same kind of analysis applies when comparing at equal video quality, so it is not repeated here.