Experiment 7 H.264 file analysis

2022-07-23 07:58:00 Myster_ KID

H.264 file analysis and codec implementation


H.264 is the new-generation digital video compression format following MPEG-4, jointly proposed by the International Organization for Standardization (ISO) and the International Telecommunication Union (ITU) in December 2002. It offers higher coding efficiency, is designed with mobile and IP networks in mind, takes channel characteristics into account, and can limit the propagation of bit errors.

I. Introduction to H.264

1. Characteristics of H.264

  • High compression ratio: at the same image quality, a stream compressed with H.264 needs only about 1/2 of the data of MPEG-2;
  • High error resilience: H.264 streams are robust against errors and can cope with high packet loss rates and severe channel interference, as found on IP and wireless networks;
  • Good network adaptability: H.264 provides a network abstraction layer, so that H.264 files can be transmitted easily over different networks;
  • High computational complexity: H.264 trades higher computational complexity for better performance; its complexity is about 2 to 3 times that of MPEG-2.

2. H.264 codec

The H.264 encoder structure differs little from that of the previous-generation MPEG-2 standard: it is a hybrid structure built mainly on transform coding and DPCM with motion compensation. The structure of the codec is shown in the figures:
[Figures: H.264 encoder and decoder structure]

3. Techniques adopted by H.264

(1) Layered design
H.264 separates the video coding layer (VCL) from the network abstraction layer (NAL) conceptually. This allows efficient transmission in different transmission environments and makes it easier to connect seamlessly with current and future coding formats and different types of networks.

(2) Intra prediction coding
The value of the current macroblock is predicted from the values of adjacent blocks, and the difference between the predicted value and the original value is then transformed, quantized and coded. Luma blocks use 4×4 and 16×16 prediction modes; chroma blocks use an 8×8 mode.

Taking the 4×4 luma intra prediction as an example: the 16 pixels of a 4×4 block are predicted from 13 neighboring pixels (above, above-right, left and above-left). There are 9 prediction modes in total; the SAE (sum of absolute errors) of each is calculated, and the mode with the smallest SAE is chosen as the prediction mode of the block.
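
The SAE-based mode decision described above can be sketched as follows. This is a minimal illustration, not the JM implementation: it covers only three of the nine modes (vertical, horizontal and DC), assumes all neighbors are available, and the function names `predict_4x4`, `sae` and `best_mode` are made up for this example.

```python
def predict_4x4(above, left, mode):
    """Build a 4x4 prediction from the 4 pixels above and the 4 pixels to the left."""
    if mode == "vertical":      # every row repeats the row of pixels above
        return [list(above) for _ in range(4)]
    if mode == "horizontal":    # every column repeats the column of pixels to the left
        return [[left[r]] * 4 for r in range(4)]
    if mode == "dc":            # flat block filled with the rounded mean of the 8 neighbors
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError(f"unsupported mode: {mode}")


def sae(block, pred):
    """Sum of absolute errors between the original block and a prediction."""
    return sum(abs(block[r][c] - pred[r][c]) for r in range(4) for c in range(4))


def best_mode(block, above, left, modes=("vertical", "horizontal", "dc")):
    """Return the mode with the smallest SAE together with all SAE values."""
    costs = {m: sae(block, predict_4x4(above, left, m)) for m in modes}
    return min(costs, key=costs.get), costs


# Example: a block made of vertical stripes is predicted perfectly by the vertical mode.
above = [10, 20, 30, 40]
left = [10, 10, 10, 10]
block = [[10, 20, 30, 40] for _ in range(4)]
mode, costs = best_mode(block, above, left)
print(mode, costs)
```

A real encoder would evaluate all nine modes and usually weigh the cost of signalling the mode as well; the sketch only shows the "smallest SAE wins" rule.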

(3) Inter prediction coding: block-based motion compensation

  • Tree-structured motion compensation: blocks of different sizes and shapes are used for motion compensation (large blocks in low-frequency areas, small blocks in high-frequency areas);
  • Motion vectors with 1/4-pixel precision;
  • Motion search over multiple reference frames, selecting the frame most similar to the frame being encoded as the reference frame;
  • Introduction of SI frames and SP frames: these meet the requirements of bandwidth adaptation and error resilience of the bitstream, improve network friendliness and support streaming media services.

(4) Integer transform
An integer DCT is used instead of the DCT of MPEG-2, so that the transform loses no accuracy. The standard also specifies the forward and inverse transforms in detail, so no mismatch between encoder and decoder can occur. In addition, the integer transform needs only shifts and additions, which maximizes computational efficiency.
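
As an illustration of why the transform is exact, here is a minimal sketch of the 4×4 forward core transform Y = Cf · X · Cfᵀ. It is written as plain matrix products for readability; a practical implementation would use a butterfly structure with additions and shifts only, and the scaling that completes the DCT approximation is folded into quantization and not shown. The helper names are made up for this example.

```python
# 4x4 forward core transform matrix (entries are only +-1 and +-2).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Multiply two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def transpose(m):
    """Transpose a 4x4 matrix."""
    return [list(row) for row in zip(*m)]


def forward_core_transform(x):
    """Compute Y = CF * X * CF^T using integer arithmetic only."""
    return matmul(matmul(CF, x), transpose(CF))


# Example: a flat residual block has energy only in the DC coefficient.
flat = [[5] * 4 for _ in range(4)]
coeffs = forward_core_transform(flat)
print(coeffs)
```

Because every intermediate value is an integer, encoder and decoder arrive at bit-identical results, which is the "no mismatch" property mentioned above.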

(5) Quantization
The H.264 standard supports 52 quantization step sizes. Each time the quantization parameter QP increases by 6, the quantization step size Qstep doubles. The wide range of step sizes gives enough flexibility and accuracy to balance bit rate against coding quality.
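
The relationship between QP and Qstep can be sketched as below. The six base values are the commonly tabulated step sizes for QP 0 to 5; the function name `qstep` is an assumption made for this example.

```python
# Commonly tabulated step sizes for QP = 0..5; each further 6 QP doubles them.
QSTEP_BASE = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]


def qstep(qp):
    """Return the quantization step size for a QP in the range 0..51."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in the range 0..51")
    return QSTEP_BASE[qp % 6] * (1 << (qp // 6))


for qp in (0, 6, 24, 30, 51):
    print(qp, qstep(qp))
```

With 52 values of QP the step size spans 0.625 to 224, which is the wide range referred to above.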

(6) Deblocking filter
To remove blocking artifacts, H.264 adds an adaptive deblocking filter inside the prediction loop of the codec.

(7) Entropy coding
The H.264 standard specifies two entropy coding methods: CAVLC (Context-based Adaptive Variable Length Coding) and CABAC (Context-based Adaptive Binary Arithmetic Coding). The former is the basic method and the latter is optional; CABAC achieves better coding performance than CAVLC, at the cost of higher computational complexity.


II. H.264 file analysis

Sequence parameter set and picture parameter set

Sequence parameter set (SPS)

| Field | Description |
| --- | --- |
| profile_idc | The profile that the bitstream conforms to. |
| level_idc | The level of the current bitstream. The level defines limits such as the maximum video resolution and maximum frame rate under given conditions; the level a bitstream conforms to is specified by level_idc. |
| seq_parameter_set_id | The id of the current sequence parameter set. A picture parameter set (PPS) uses this id to refer to the parameters in the SPS. |
| pic_order_cnt_type | The method used to decode the picture order count (POC). |
| log2_max_pic_order_cnt_lsb_minus4 | Used to compute MaxPicOrderCntLsb, the upper limit of the POC: MaxPicOrderCntLsb = 2^(log2_max_pic_order_cnt_lsb_minus4 + 4) |
| num_ref_frames | The maximum number of short-term and long-term reference frames, complementary reference field pairs and unpaired reference fields that may be used for inter prediction when decoding any picture in the video sequence. |
| pic_width_in_mbs_minus1 | Plus 1, the width of each decoded picture in units of macroblocks. |
| pic_height_in_map_units_minus1 | Plus 1, the height of each decoded picture in map units (equal to macroblock rows when frame_mbs_only_flag is 1). |
| frame_mbs_only_flag | Flag describing how macroblocks are coded. When it is 0, macroblocks may be frame coded or field coded; when it is 1, all macroblocks are frame coded. |
| direct_8x8_inference_flag | Flag used in deriving the motion vectors of the B_Skip and B_Direct modes. |
| gaps_in_frame_num_value_allowed_flag | Flag indicating whether gaps (discontinuous values) are allowed in frame_num. |
| frame_cropping_flag | Flag indicating whether the output picture frames are cropped. |
| vui_parameters_present_flag | Flag indicating whether the SPS contains VUI information. |

Picture parameter set (PPS)

| Field | Description |
| --- | --- |
| pic_parameter_set_id | The id of the current PPS. A PPS is referenced by the corresponding slices in the bitstream; a slice refers to a PPS by storing the PPS id in its slice header. The value range is [0,255]. |
| seq_parameter_set_id | The id of the active SPS referenced by the current PPS. Through it, the PPS can access the parameters of the corresponding SPS. The value range is [0,31]. |
| entropy_coding_mode_flag | Entropy coding mode flag, indicating which algorithm is used for entropy coding/decoding in the bitstream (0: CAVLC, 1: CABAC). For some syntax elements, different entropy coding methods are selected under different coding configurations. |
| num_slice_groups_minus1 | The number of slice groups in a frame. When the value is 0, all slices in a frame belong to one slice group. A slice group is a set of macroblocks in a frame, defined in clause 3.141. |
| weighted_pred_flag | Flag indicating whether weighted prediction is enabled in P/SP slices. |
| weighted_bipred_idc | The weighted prediction method used in B slices; the value range is [0,2]. 0 means default weighted prediction, 1 means explicit weighted prediction, 2 means implicit weighted prediction. |
| pic_init_qp_minus26 and pic_init_qs_minus26 | The initial quantization parameters. The actual quantization parameter is computed from this parameter and slice_qp_delta/slice_qs_delta in the slice header. |
| chroma_qp_index_offset | Used to compute the quantization parameter of the chroma components; the value range is [-12,12]. |
| deblocking_filter_control_present_flag | Flag indicating whether the slice header contains information for controlling the deblocking filter. When it is 1, the slice header contains the corresponding deblocking filter information; when it is 0, it does not. |

Analyzing an .mp4 file with a bitstream analyzer

SPS information of the first frame of movie.mp4:
[Figure: SPS fields shown in the bitstream analyzer]

  • profile_idc: 100, which identifies High profile
  • level_idc: 31, corresponding to level 3.1; at most 108000 macroblocks are processed per second, and a frame contains at most 3600 macroblocks
  • Under High profile at this level, the maximum bit rate is 17.5 Mbps
  • seq_parameter_set_id: 0, the id of the current sequence parameter set is 0
  • log2_max_pic_order_cnt_lsb_minus4: 2, from which the upper limit of the POC is 2^(2+4) = 64
  • pic_order_cnt_type: 0
  • num_ref_frames: 16, the maximum number of reference frames is 16
  • pic_width_in_mbs_minus1: 39, each picture is 40 macroblocks wide
  • pic_height_in_map_units_minus1: 22, each picture is 23 macroblocks high
  • gaps_in_frame_num_value_allowed_flag: 0, discontinuous values are not allowed in frame_num
  • frame_mbs_only_flag: 1, all macroblocks are frame coded
  • frame_cropping_flag: 0, the output picture frames are not cropped
  • vui_parameters_present_flag: 1, the SPS contains VUI information

PPS information:
[Figure: PPS fields shown in the bitstream analyzer]

  • pic_parameter_set_id: 0, the id of the current PPS is 0
  • seq_parameter_set_id: 0, the id of the active SPS referenced by the current PPS is 0, i.e. the SPS described above
  • entropy_coding_mode_flag: 1, the entropy coding mode flag is 1 (CABAC)
  • num_slice_groups_minus1: 0, all slices in a frame belong to one slice group
  • weighted_pred_flag: 1, weighted prediction is enabled for P/SP slices
  • weighted_bipred_idc: 2, implicit weighted prediction is used in B slices
  • pic_init_qp_minus26: 0, initial quantization parameter
  • pic_init_qs_minus26: 0, initial quantization parameter
  • chroma_qp_index_offset: -2, used to compute the quantization parameter of the chroma components
  • deblocking_filter_control_present_flag: 1, the slice header contains the deblocking filter information
  • constrained_intra_pred_flag: 0, intra macroblocks may use information from inter-coded macroblocks
  • redundant_pic_cnt_present_flag: 0, the slice header contains no information for the redundant_pic_cnt syntax element

To sum up: the picture is 40 macroblocks wide and 23 macroblocks high, so the frame width is 40×16 = 640 and the frame height is 23×16 = 368, giving a resolution of 640×368. The analyzer reports 96 kbps (given in the original as the frame rate, although kbps is a bit-rate unit), and all slices belong to one slice group.
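
The arithmetic behind this summary can be written out as a small sketch. The dictionary holds the SPS values read from the analyzer above; the function name `derive_picture_info` is made up for this example, and cropping is ignored because frame_cropping_flag is 0 here.

```python
MB_SIZE = 16  # a macroblock covers 16x16 luma samples

# SPS values as read from the analyzer output above.
sps = {
    "log2_max_pic_order_cnt_lsb_minus4": 2,
    "pic_width_in_mbs_minus1": 39,
    "pic_height_in_map_units_minus1": 22,
    "frame_mbs_only_flag": 1,
}


def derive_picture_info(sps):
    """Derive picture size and POC limit from SPS fields (no cropping applied)."""
    width_mbs = sps["pic_width_in_mbs_minus1"] + 1
    map_units = sps["pic_height_in_map_units_minus1"] + 1
    # A map unit is one macroblock row for frame-only streams, two rows otherwise.
    height_mbs = (2 - sps["frame_mbs_only_flag"]) * map_units
    return {
        "width": width_mbs * MB_SIZE,
        "height": height_mbs * MB_SIZE,
        "mbs_per_frame": width_mbs * height_mbs,
        "max_pic_order_cnt_lsb": 1 << (sps["log2_max_pic_order_cnt_lsb_minus4"] + 4),
    }


info = derive_picture_info(sps)
print(info)
```

The 920 macroblocks per frame computed here are well below the 3600-macroblock limit quoted for level 3.1.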


Analyzing the GOP structure

A GOP (Group of Pictures) is a group of several consecutive pictures in a picture sequence. It is the basic unit for editing, accessing and compression-coding the encoded video stream, and it contains frames of different coding types.

Increasing the GOP length, or raising the proportion of P/B frames within the GOP, improves the compression ratio and lowers the bit rate. So in general, at a constant bit rate, a larger GOP gives better image quality (because the proportion of P/B frames is larger); at a given image quality, a larger GOP gives a lower bit rate.


III. Decoding

In the ldecod project, open decoder_test.c, where the input and output file names can be changed, for example:

#define BITSTREAM_FILENAME "highway_qcif.264"
#define DECRECON_FILENAME "highway_qcif_dec.yuv"

This decodes the .264 file into a .yuv file:
[Figure: decoding result]


IV. Encoding

1. Encoding parameters

| Line | Parameter | Meaning | Values |
| --- | --- | --- | --- |
| 13/56/57 | InputFile / OutputFile / ReconFile | Input file / output file (.264) / reconstructed file (.yuv) | |
| 30/31/33/34 | SourceWidth / SourceHeight / OutputWidth / OutputHeight | Width and height of the input and output video sequences | |
| 16 | FramesToBeEncoded | Number of frames to encode | |
| 72 | IntraPeriod | Period of I frames within the GOP | 0 means only the first frame of the GOP is an I frame |
| 73 | IDRPeriod | Period of IDR frames, i.e. the GOP length | 0 means only the first frame is an IDR frame |
| 77 | EnableIDRGOP | Whether IDR frames are allowed in the GOP | 0: disabled, 1: enabled |
| 78 | EnableOpenGOP | Whether open GOPs are allowed | 0: disabled, 1: enabled |
| 180 | NumberBFrames | Number of B frames between two I/P frames | |
| 347 | PrimaryGOPLength | GOP length for redundant allocation (1-16) | |
| 444 | RateControlEnable | Whether rate control is enabled | 0: disabled, 1: enabled |
| 445 | Bitrate | Bit rate (unit: bps) | |
| 453 | RCUpdateMode | Rate control mode. All-I-frame coding uses mode 1, other cases use mode 2 or 3; this experiment uses 2 | 0: original JM rate control; 1: rate control algorithm applied to all frames; 2: mode 1 plus consideration of the quantization parameters of I/P slices; 3: hybrid quadratic rate control algorithm, i.e. the real-time bit rate is taken into account during control |

2. Encoder debugging

In all of the following experiments we set IntraPeriod = 0 and PrimaryGOPLength = IDRPeriod, so that these variables are controlled.

The bit rates chosen are 1600, 1650, 1700, 1750 and 1800 kbps. At each bit rate, the video is encoded with the GOP structures GOP15(2B), GOP12(2B), GOP9(2B), GOP4(1B), GOP12(no B) and GOP1(all-I).
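
The resulting grid of 5 bit rates × 6 GOP structures can be enumerated with a short sketch like the one below. The parameter names are those from the table above; mapping each GOP length directly to IDRPeriod, and the helper name `build_runs`, are assumptions made for this example, and RCUpdateMode is left out on purpose. How the settings are passed to the encoder is not shown.

```python
from itertools import product

BITRATES_KBPS = [1600, 1650, 1700, 1750, 1800]

# label -> (GOP length, number of B frames between two I/P frames)
GOP_STRUCTURES = {
    "GOP15(2B)": (15, 2),
    "GOP12(2B)": (12, 2),
    "GOP9(2B)": (9, 2),
    "GOP4(1B)": (4, 1),
    "GOP12(0B)": (12, 0),
    "GOP1(all-I)": (1, 0),
}


def build_runs():
    """Return one dictionary of encoder settings per (GOP structure, bit rate) pair."""
    runs = []
    for (name, (gop_len, n_b)), kbps in product(GOP_STRUCTURES.items(), BITRATES_KBPS):
        runs.append({
            "label": f"{name}@{kbps}kbps",
            "IntraPeriod": 0,              # fixed in all experiments
            "IDRPeriod": gop_len,          # assumed to carry the GOP length
            "PrimaryGOPLength": gop_len,   # kept equal to IDRPeriod
            "NumberBFrames": n_b,
            "RateControlEnable": 1,
            "Bitrate": kbps * 1000,        # the Bitrate parameter is in bps
        })
    return runs


runs = build_runs()
print(len(runs), runs[0])
```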

Run the lencod project to encode miss.yuv. The output is as follows:
[Figure: encoder output table]
Taking GOP12(0B) @ 1800 kbps as an example, the table shows that the GOP size is 12, and the frame reordering can also be seen.


V. Analysis of results

1. Bitstream analysis

The encoded content is analyzed with the bitstream analysis software Elecard StreamEye Tools.

Taking GOP15(2B) as an example:

[Figures: Elecard StreamEye view of the GOP15(2B) stream]
Each rectangle in the figure represents one frame (red, blue and green denote I, P and B frames). The structure of the encoded stream is clearly visible, and frame reordering can be seen (the display order is IBBPBBPBBPBBPBP).
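
The frame reordering mentioned above can be illustrated with a small sketch that turns a display order into a plausible coding order. It assumes that B frames are not used as references and that each B frame needs only the nearest following I/P frame to be coded first; the function name `coding_order` is made up for this example, and the order actually produced by the encoder may differ.

```python
def coding_order(display):
    """Return display indices in a coding order where each anchor (I/P) precedes
    the B frames that are displayed before it."""
    order, pending_b = [], []
    for idx, frame_type in enumerate(display):
        if frame_type == "B":
            pending_b.append(idx)      # wait for the next anchor frame
        else:
            order.append(idx)          # code the anchor first
            order.extend(pending_b)    # then the B frames displayed before it
            pending_b = []
    order.extend(pending_b)            # trailing B frames, if any
    return order


display = "IBBPBBPBBPBBPBP"
order = coding_order(display)
print(order)
print("".join(display[i] for i in order))
```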

Let us analyze 4 of these frames (I, B, B, P).
[Figures: the four frames (I, B, B, P) shown with macroblock partition lines, macroblock types and motion vectors]
In the preview we turn on the display of macroblock partition lines, macroblock types and motion vectors.

In the pictures, red and orange macroblocks are intra coded (the block sizes differ slightly); yellow macroblocks are of type Inter (B_Skip), meaning they are the same as in the previous frame and are skipped without being coded; green macroblocks use forward prediction, and blue macroblocks use bidirectional prediction.

From the pictures we can see:

  • The background of the 4 frames changes very little, so the background is not coded;
  • Motion vectors are visible in the B and P frames. Red lines indicate forward prediction and green lines indicate backward prediction, so P frames contain only red lines; I frames are intra coded and have no motion vectors;
  • H.264 macroblock partitions of different shapes and sizes are also visible, with larger partitions in low-frequency regions. This is one of the advanced techniques adopted by H.264 mentioned earlier.

2. Objective evaluation of video quality

We now evaluate objectively the video quality of the H.264 files encoded earlier with 6 GOP structures at 5 bit rates each. The metric is the PSNR of the Y component (dB). Because the actual bit rate of an encoded H.264 file deviates somewhat from the target bit rate, the rate-distortion curves are drawn against the actual bit rate.
[Figure: rate-distortion curves (Y-PSNR versus actual bit rate) for the 6 GOP structures]
As the figure shows, at the same bit rate the video quality from best to worst is: GOP12 (0B) > GOP15 (2B) > GOP12 (2B) > GOP9 (2B) > GOP4 (1B) > GOP1 (all-I).
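
For reference, the Y-component PSNR used as the metric here can be computed as in the following sketch. It assumes 8-bit samples (peak value 255) and takes the luma samples of the original and the decoded frame as flat sequences of equal length; the function name `y_psnr` is made up for this example, and reading the Y plane out of a .yuv file is not shown.

```python
import math


def y_psnr(ref, dec, peak=255):
    """PSNR in dB between two equally long sequences of luma samples."""
    if len(ref) != len(dec) or len(ref) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(ref, dec)) / len(ref)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak * peak / mse)


# Example: every sample is off by one, so the MSE is exactly 1.
ref = bytes([100] * 16)
dec = bytes([101] * 16)
print(round(y_psnr(ref, dec), 2))
```

For a whole sequence, one common choice is to average the per-frame PSNR values; whether the curves above were obtained that way is not stated in the source.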

The following table lists the proportions of I, P and B frames in each of the 6 GOP structures. For example, comparing GOP15(2B) with GOP12(2B): the proportion of P frames is the same in both, but the proportion of B frames is larger in the former, so it gives better image quality. The case of equal video quality can be analyzed in the same way and is not discussed further here.
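
The frame-type proportions discussed above can be computed from a frame pattern as sketched below. The GOP15(2B) pattern is the display order reported earlier; the GOP12(2B) pattern is an assumption formed by analogy (the same pattern cut to 12 frames, ending on a P frame) and is not taken from the measurements. The function name `frame_type_shares` is made up for this example.

```python
from collections import Counter
from fractions import Fraction


def frame_type_shares(pattern):
    """Return the exact share of I, P and B frames in a frame pattern string."""
    counts = Counter(pattern)
    return {t: Fraction(counts.get(t, 0), len(pattern)) for t in "IPB"}


gop15 = frame_type_shares("IBBPBBPBBPBBPBP")  # display order from the stream analysis
gop12 = frame_type_shares("IBBPBBPBBPBP")     # assumed pattern, for illustration only

for name, shares in (("GOP15(2B)", gop15), ("GOP12(2B)", gop12)):
    print(name, {t: f"{float(s):.1%}" for t, s in shares.items()})
```

Under this assumption both structures have one third P frames, while GOP15(2B) has a slightly larger share of B frames, which agrees with the comparison made in the text.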

Copyright notice
This article was written by [Myster_ KID]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/204/202207222136291119.html