Video format basics: understanding MKV, MP4, H.265, bit rate, color depth, and more
2022-07-28 07:11:00 【angleoldhen】
Reprinted from: https://www.4k123.com/thread-8194-1-1.html
This tutorial covers the following topics:
1、 Container formats (MP4/MKV…) vs media formats (H.264/FLAC/AAC…)
2、 Basic video parameters: resolution, frame rate, and bit rate
3、 Color depth
4、 Representing images: the RGB model vs the YUV model
5、 Chroma subsampling
6、 Low and high frequencies in space: flat areas, textures, and lines
7、 Low and high frequencies in time: motion
8、 A brief account of clarity and image quality
1、 Container formats (MP4/MKV…) vs media formats (H.264/FLAC/AAC…)
MP4 and MKV are the most common types of video file you will download. These files are really just packages, and the file extension names the packaging method. Inside the package are the video (images only), the audio (sound only), subtitles, and so on. When playing a file, the player first unpacks it (the technical term is splitting, also known as demuxing), takes out the video, audio, and other streams, and then plays them.
Because the file is only a package, the extension guarantees neither what is inside nor how many items there are. Each item in the package is called a track, and the usual ones are:
Video: almost always present, with exceptions such as an external audio track in mka format, which is effectively an mkv without video. Note that "video" here does not include sound.
Audio: almost always present, although some silent material has no need for it.
Chapters: the segment information from the original Blu-ray disc. If the file carries it, the player can show chapter marks:
.potplayer: right-click the picture, Options - Playback - Show bookmarks/chapter marks on the progress bar
.mpc-hc: right-click the picture, Options - Tweaks - Show chapter marks in the progress bar
Subtitles: sometimes a file ships with subtitles that are not hard subtitles burned into the video; they are then packaged in the container alongside the other tracks.
There may be other attachments, which are not listed here. A file need not have only one track of each type; MKV files with multiple audio tracks, for example, are common.
Each track has its own format. When we say that the video is H.264 and the audio is AAC, these are the formats of the individual tracks.
Common video formats include H.264 (subdivided into 8bit and 10bit), H.265 (currently also available in 8bit and 10bit), RealVideo (common in early rm/rmvb files), and VC-1 (led by Microsoft, common in wmv). In practice, H.264 = AVC = AVC1, and H.265 = HEVC.
Common audio formats include four lossless ones, FLAC, ALAC, TrueHD, and DTS-HD MA, and four lossy ones, AAC, MP3, AC3, and DTS (Core).
The main differences between MKV and MP4 are:
- MKV supports FLAC audio, while MP4 has traditionally not supported it. MP4 can still carry a lossless audio track, for instance ALAC, although ALAC is generally considered less efficient than FLAC.
- MKV supports ASS/SSA subtitles; MP4 does not. Fansub groups generally produce ASS subtitles, which is why releases with embedded (soft) subtitles are usually MKV.
- As an industry standard, MP4 is generally better supported than MKV by video editing software and playback devices. This is why the vcb-s releases optimized for mobile devices mostly use MP4.
Beyond that, the two formats are largely interchangeable. Both can carry AVC and HEVC, at either 8bit or 10bit precision. The claim that MP4 has worse picture quality than MKV is therefore simply ignorant: the two can contain exactly the same video.
The differences are historical. MKV was developed by the community to replace the aging AVI and to support H264 better, and the flexibility with which it can be developed and modified lets it accommodate non-industry-standard formats such as flac and ass. MP4, by contrast, was born into a wealthy family: it is an industry standard that replaced the older MPG as the new generation of video/audio container.
2、 Basic video parameters: resolution, frame rate, and bit rate
A video is a sequence of images. Each image is called a frame. An image is made up of pixels, and the number of pixels it contains is its resolution. A 1920×1080 image, for instance, is 1920 pixels wide and 1080 pixels high. The resolution of a video is the resolution of each of its frames.
The number of images a video shows per second is its frame rate. Common frame rates are 24000/1001 = 23.976, 30000/1001 = 29.970, 60000/1001 = 59.940, 25.000, and 50.000. The number is how many images are shown in one second; 23.976, for example, means 24000 images every 1001 seconds. The frame rate can be constant (cfr, Constant Frame Rate) or variable (vfr, Variable Frame Rate).
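The frame-rate arithmetic above can be sketched in a few lines of Python. This is only an illustration: the standard `fractions` module keeps ratios such as 24000/1001 exact, and the names `COMMON_FRAME_RATES` and `frames_in` are made up for this example.

```python
from fractions import Fraction

# Frame rates from the text, stored as exact ratios rather than rounded decimals.
COMMON_FRAME_RATES = {
    "23.976": Fraction(24000, 1001),
    "29.970": Fraction(30000, 1001),
    "59.940": Fraction(60000, 1001),
    "25.000": Fraction(25),
    "50.000": Fraction(50),
}


def frames_in(seconds, rate):
    """Number of frames shown in `seconds` at a constant frame rate."""
    return rate * seconds


for label, rate in COMMON_FRAME_RATES.items():
    print(f"{label}: {float(rate):.6f} frames per second")

# 1001 seconds of 23.976 fps video contain a whole number of frames.
print(frames_in(1001, COMMON_FRAME_RATES["23.976"]))
```

Keeping the rate as a ratio avoids the slow drift that would accumulate if 23.976 were used as a rounded decimal over a long video.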
Bit rate is defined as the size of the video file divided by its duration. The unit is usually Kbps (Kbit/s) or Mbps (Mbit/s). Note that 1 B (byte) = 8 b (bits). So for a 24-minute, 900 MB video:
Size: 900 MB = 900 MByte = 7200 Mbit
Duration: 24 min = 1440 s
Bit rate: 7200 Mbit / 1440 s = 5 Mbps = 5000 Kbps
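The same calculation as a small Python sketch. It assumes decimal units (1 MB = 1000 KB), as the arithmetic above does; the function names are illustrative.

```python
def bitrate_kbps(size_mb, duration_s):
    """Average bit rate in Kbit/s for a file of `size_mb` megabytes."""
    size_mbit = size_mb * 8                 # 1 byte = 8 bits
    return size_mbit * 1000 / duration_s    # Mbit -> Kbit, then per second


def file_size_mb(kbps, duration_s):
    """Inverse: file size in megabytes for a given average bit rate."""
    return kbps * duration_s / 1000 / 8


print(bitrate_kbps(900, 24 * 60))   # the 24-minute, 900 MB example
print(file_size_mb(5000, 24 * 60))  # and back again
```

The inverse function is handy for estimating how large an episode will be before encoding it at a target bit rate.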
When videos have roughly the same duration (an episode today runs about 24 minutes, for example), bit rate and file size are essentially equivalent: both describe how large the video is. Files with the same duration and resolution but different sizes simply have different bit rates.
Bit rate can also be read as the total amount of data used to record the video per unit of time. The higher the bit rate, the more data is used to record the video, which implies that the video can potentially have better quality. (Only potentially: later sections explain why a high bit rate does not necessarily mean high image quality.)
3、 Color depth
Color depth (bit depth), which is what 8bit and 10bit usually refer to, is the precision of each channel. In 8bit, each channel is represented by an 8-bit integer (0~255); in 10bit, by a 10-bit integer (0~1023); in 16bit, the range is 0~65535.
(Note that this statement is not rigorous: encoded video does not necessarily use the whole 0~255 range. Part of it may be reserved so that only a sub-range such as 16~235 is used. We will not go into detail here.)
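A short Python sketch of the value ranges just described. The full ranges follow directly from the bit depth. The limited range at 8bit (16~235) is the one mentioned above; scaling it to higher bit depths by a factor of 2^(bits-8), which gives 64~940 at 10bit, is the usual convention and is stated here as an assumption, not something the article spells out.

```python
def full_range(bits):
    """Smallest and largest integer an unsigned channel of `bits` bits can hold."""
    return 0, 2 ** bits - 1


def limited_luma_range(bits):
    """Limited ("TV") luma range: 16..235 at 8 bits, scaled up for deeper formats."""
    scale = 2 ** (bits - 8)
    return 16 * scale, 235 * scale


for bits in (8, 10, 16):
    print(bits, full_range(bits), limited_luma_range(bits))
```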
An 8bit monitor can display every intensity from 0 to 255 in each RGB channel. The color depth of a video, however, is the depth of its YUV data, and during playback YUV has to be converted to RGB. The higher precision of 10bit therefore helps indirectly: it increases precision during the calculation, which makes the final colors finer.
Why playing 10bit video is worthwhile even on an 8bit monitor can be understood with an example:
A circle has a radius of 12.33 m. Find its area to two decimal places.
The radius is given to two decimal places and the result is required to two decimal places. How precise does pi need to be? Only two decimal places?
With pi = 3.14, the area is 477.37 square meters.
With pi = 3.1416, the area is 477.61 square meters.
With pi at sufficiently high precision, the area is 477.61 square meters. So pi = 3.1416 is enough, but 3.14 is not.
In other words, a low-precision final output does not mean that the numbers and operations leading to it can also be low-precision. This is why 10bit YUV retains a precision advantage over 8bit YUV even when the final output is 8bit RGB. In fact, after conversion 8bit YUV covers roughly 26% of the precision of 8bit RGB, whereas 10bit covers about 97%. If you want your 8bit display to show 97% of its possible fineness, watch 10bit.
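The circle-area example above can be reproduced directly. This Python sketch only re-runs the arithmetic from the text; `circle_area` is an illustrative name.

```python
import math

RADIUS_M = 12.33


def circle_area(radius, pi):
    """Area rounded to two decimal places, using the given approximation of pi."""
    return round(pi * radius ** 2, 2)


for pi in (3.14, 3.1416, math.pi):
    print(f"pi = {pi!r}: area = {circle_area(RADIUS_M, pi)} square meters")
```

The intermediate constant needs more digits than the final answer, which is the same reason a 10bit intermediate helps an 8bit output.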
The insufficient precision of 8bit shows mainly in low-brightness areas, where banding forms easily:
Notice the ripple-like rings on the right side of the picture. This is what insufficient color precision looks like.
The advantage of 10bit lies not only in display precision: compared with 8bit, it also compresses better and reduces distortion. That aspect is not covered here.
4、 Representing images: the RGB model vs the YUV model
The primary colors of light are red, green, and blue. Modern displays reproduce colors by combining these three primaries at different intensities. Storing an image by recording the red, green, and blue intensity of every pixel is called the RGB model.
Among common image formats, PNG and BMP are based on the RGB model.
Across the three channels, the amount of information and detail is not necessarily evenly distributed. Take the blush on Kotori Minami's cheeks: how well it stands out differs across the three planes. It is almost indistinguishable in the red plane and is visible mainly in the green and blue planes. In the white part of the cheek, all three colors are nearly saturated; in the blushing part, only red is saturated while green and blue are not. That is why red stands out.
Besides RGB, there is another widely used model, the YUV model, also called the luma-chroma model. It uses a mathematical transform to convert the three RGB channels into one channel representing brightness (Y, also called luma) and two channels representing color (UV, also called chroma).
Here is a concrete analogy. A farm raises pigs and cattle. One way to keep count is (number of pigs, number of cattle).
Another is (total = pigs + cattle, difference = pigs - cattle). A simple formula converts between the two.
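The two ways of counting can be written as a pair of functions that undo each other. This Python sketch is just the farm analogy in code; the function names are illustrative.

```python
def to_total_and_difference(pigs, cattle):
    """Record the herd as (total, difference) instead of (pigs, cattle)."""
    return pigs + cattle, pigs - cattle


def to_pigs_and_cattle(total, difference):
    """Recover the original counts from (total, difference)."""
    return (total + difference) // 2, (total - difference) // 2


total, difference = to_total_and_difference(30, 12)
print(total, difference)
print(to_pigs_and_cattle(total, difference))
```

No information is lost in either direction; only the way it is split across the two numbers changes.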
The YUV model does something similar: a suitable transform of the RGB data yields a different representation. There are several variants of the YUV model. Take the widely used YCbCr model: it converts RGB into a luma channel (Y), a blue-difference chroma channel (Cb), and a red-difference chroma channel (Cr). You do not need to understand the formula behind the transform; just look at the result:
Luma channel only:
YUV formats are generally preferred for processing and storing images and video, for the following reasons:
1、 The human eye is far more sensitive to brightness than to color, so most of the useful information it perceives comes from brightness. The YUV model puts most of that information into the Y channel, leaving much less in the UV channels. Compared with the more even distribution of the RGB model, this reduces redundancy and makes compression easier.
2、 It stays backward compatible with black-and-white display devices.
3、 Adjusting brightness and color saturation during image editing is more convenient in the YUV model.
Almost all video formats, as well as the widely used JPEG image format, are based on the YCbCr model. During playback the player has to convert the YCbCr data to RGB by calculation. This step is called rendering.
Each channel is usually recorded as an integer. RGB24, for example, uses 8 bits for each of R, G, and B, so the intensity of each color is a value from 0 to 255 (the range of an 8-bit binary number). The YUV model is no exception: the value of each channel is also an integer.
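For readers who do want to see the transform, here is a minimal Python sketch of an RGB to YCbCr conversion and its inverse. The article does not say which standard it has in mind, so the BT.709 luma weights (commonly used for HD video) and the 8-bit limited-range quantization (Y in 16~235, Cb/Cr in 16~240) are assumptions made for illustration, and all function names are hypothetical.

```python
# BT.709 luma weights (an assumption; the article does not name a standard).
KR, KB = 0.2126, 0.0722
KG = 1.0 - KR - KB


def rgb_to_ycbcr(r, g, b):
    """RGB in 0..1 -> (Y in 0..1, Cb and Cr in -0.5..0.5)."""
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2 * (1 - KB))   # blue-difference chroma
    cr = (r - y) / (2 * (1 - KR))   # red-difference chroma
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse transform, the kind of step a player performs before display."""
    r = y + 2 * (1 - KR) * cr
    b = y + 2 * (1 - KB) * cb
    g = (y - KR * r - KB * b) / KG
    return r, g, b


def quantize_8bit_limited(y, cb, cr):
    """Store the channels as 8-bit integers: Y in 16..235, Cb/Cr in 16..240."""
    return round(16 + 219 * y), round(128 + 224 * cb), round(128 + 224 * cr)


color = (0.9, 0.6, 0.5)                 # an arbitrary RGB color
y, cb, cr = rgb_to_ycbcr(*color)
print(quantize_8bit_limited(y, cb, cr))
print(ycbcr_to_rgb(y, cb, cr))          # approximately the original color
```

Before quantization the transform is reversible, like the farm example; the rounding to integers is where precision is lost, which is what the color depth section is about.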
5、 Chroma subsampling
In applications of the YUV model, Y and UV are not equally important. When images and video are stored and transmitted, Y is usually recorded at full resolution, while UV is recorded at half or even a quarter of the resolution. This is called chroma subsampling. It effectively reduces transmission bandwidth and makes the UV planes more compressible, but it inevitably loses useful information from the UV planes.
The video we usually watch most often uses 420 (4:2:0) sampling, written yuv420 when combined with the YUV format. Y is kept in full and UV is recorded at (1/2) x (1/2) resolution. In a 1920×1080 video, only the luma plane is actually 1920×1080; both chroma planes are just 960×540.
You can also choose not to reduce the chroma at all. This is called 444 sampling, or yuv444, and all three YUV planes are at full resolution.
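The plane sizes for 420 and 444 sampling can be worked out with a small Python sketch. The function names are illustrative, and rounding odd dimensions up is an assumption that does not matter for the 1920×1080 example.

```python
def plane_sizes(width, height, sampling):
    """Return the (width, height) of the Y, U and V planes for '420' or '444'."""
    if sampling == "444":
        chroma = (width, height)
    elif sampling == "420":
        # Half resolution in both directions (odd sizes rounded up).
        chroma = ((width + 1) // 2, (height + 1) // 2)
    else:
        raise ValueError(f"unsupported sampling: {sampling}")
    return {"Y": (width, height), "U": chroma, "V": chroma}


def samples_per_frame(width, height, sampling):
    """Total number of stored samples in one frame."""
    return sum(w * h for w, h in plane_sizes(width, height, sampling).values())


print(plane_sizes(1920, 1080, "420"))
print(samples_per_frame(1920, 1080, "420") / samples_per_frame(1920, 1080, "444"))
```

For 1920×1080, 420 sampling stores exactly half as many samples per frame as 444, before any compression is applied.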
For YUV->RGB conversion, the reduced UV planes are first upscaled to the resolution of Y (madVR lets you choose the algorithm under Chroma Upscaling) and then converted to RGB. For RGB->YUV conversion, the data is first converted to 444 (all YUV planes at the same resolution) and the UV resolution is then reduced.
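To make the reduce and upscale steps concrete, here is a deliberately simple Python sketch that works on one chroma plane stored as a list of rows. It uses 2×2 averaging to reduce and nearest-neighbor repetition to upscale. These are the simplest possible choices, picked for illustration; they are not the algorithms that madVR or any particular encoder uses.

```python
def downsample_2x2(plane):
    """Halve both dimensions by averaging each 2x2 block (even sizes assumed)."""
    return [
        [
            (plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) // 4
            for x in range(0, len(plane[0]), 2)
        ]
        for y in range(0, len(plane), 2)
    ]


def upsample_nearest(plane):
    """Double both dimensions by repeating every sample 2x2 times."""
    out = []
    for row in plane:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


flat = [[10, 10, 200, 200], [10, 10, 200, 200]]
detailed = [[0, 100], [100, 200]]

print(upsample_nearest(downsample_2x2(flat)) == flat)          # survives the round trip
print(upsample_nearest(downsample_2x2(detailed)) == detailed)  # detail is lost
```

A plane that is already uniform within each 2×2 block comes back unchanged, while fine chroma detail does not, which is the loss described in the previous section.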
Commonly available sources, including all Blu-ray discs, use 420 sampling, so finished encodes generally keep 420 sampling as well. The label yuv420 thus means that the video is in YUV format with 420 sampling.
Turning 420 into 444 means manually upscaling the UV planes by 2×2. Now that renderers such as madVR already upscale the UV planes well, doing so is as pointless as upscaling a DVD into pseudo-HD.
Sometimes, of course, processing and repair have to be done in 444/RGB. A common case is a video whose RGB planes are misaligned (Cardcaptor Sakura, for example). The repair first upscales UV, converts to RGB, and converts back to YUV afterwards. The repaired result amounts to a newly composited image, and in this case keeping the 444 format is justified and necessary.
Encoding 444 in H264 requires the High 4:4:4 Predictive Profile (Hi444pp for short). So when you see a label such as Hi444pp or yuv444, look for the release author's explanation of why this was done. If you cannot find a valid reason, you should assume the author was fooling around.
6、 Low and high frequencies in space: flat areas, textures, and lines
In video processing, "spatial" refers to what lies within a single frame (think of it as the two-dimensional plane the picture represents). Its counterpart is "temporal", which concerns the changes between frames.
Let us look at the luma picture again:
Areas where brightness changes quickly and strongly are called high-frequency regions. Conversely, areas where brightness changes slowly and only slightly are called low-frequency regions.
The blue circle in the figure marks a typical low-frequency region, also called a flat area. Brightness hardly changes there.
In the green circle, brightness jumps abruptly. This kind of high-frequency region is called a line.
In the red circle, brightness changes frequently, with both large and small swings. This kind of high-frequency region is called texture.
Sometimes lines and textures (the high-frequency regions) are collectively called lines, and flat areas (the low-frequency regions) are called non-lines.
These examples concern the luma plane. The concepts of high frequency, low frequency, lines, and so on apply equally to the chroma planes, where they describe how quickly and how strongly the color changes. What we usually call "detail" is the high-frequency information in an image.
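As a rough illustration of "how quickly and how strongly brightness changes", the Python sketch below scores one row of luma samples by the mean absolute difference between neighboring pixels. This is a crude stand-in chosen for this example, not a measure used by any real encoder, and the sample rows are invented.

```python
def spatial_activity(row):
    """Mean absolute difference between horizontally adjacent samples."""
    differences = [abs(b - a) for a, b in zip(row, row[1:])]
    return sum(differences) / len(differences)


flat_area = [100, 101, 100, 101, 100, 101]   # barely changes: low frequency
line_edge = [20, 20, 20, 220, 220, 220]      # one abrupt jump: a line
texture = [50, 180, 60, 190, 40, 170]        # frequent swings: texture

for name, row in (("flat", flat_area), ("line", line_edge), ("texture", texture)):
    print(name, spatial_activity(row))
```

The flat row scores far lower than the line and texture rows, matching the split between low-frequency and high-frequency regions.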
Generally, the more high-frequency information a picture has, the more information it contains, the more data is needed to record it, and the more computation is needed to encode it. A video with a lot of spatial high-frequency information (that is, a lot of detail in each frame) is said to have high spatial complexity.
To record a picture, the encoder has to decide how much bit rate to give to each part. The distribution of bit rate among the parts of a picture is called the spatial allocation of bit rate. When it is done well, the whole picture tends to look consistent. When it is done badly, typically either the lines and textures look acceptable but the flat background shows heavy banding and blocking (too much bit rate went to the lines), or the background colors look natural but textures are blurred and lines are mangled (too much bit rate went to the non-line areas).
7、 Low and high frequencies in time: motion
In video processing, "temporal" concerns the changes between frames. Its counterpart is "spatial".
Motion hardly needs explaining: it is how strongly and how often the image changes from frame to frame. If a video has a lot of motion and changes drastically, we say it has high temporal complexity, with much high-frequency information in the time domain. If the video is calm and mostly static, we say it has low temporal complexity, with mostly low-frequency information in the time domain.
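The idea of change between frames can be illustrated with a crude Python sketch that averages the absolute per-pixel difference between consecutive frames. Each frame is a flat list of luma samples; the measure and the sample clips are invented for this example and are not what a real encoder computes.

```python
def temporal_activity(frames):
    """Mean absolute per-pixel change between consecutive frames."""
    total = 0
    count = 0
    for previous, current in zip(frames, frames[1:]):
        for a, b in zip(previous, current):
            total += abs(b - a)
            count += 1
    return total / count if count else 0.0


static_clip = [[10, 20, 30, 40]] * 3
moving_clip = [[10, 20, 30, 40], [40, 10, 20, 30], [30, 40, 10, 20]]

print(temporal_activity(static_clip))   # nothing changes between frames
print(temporal_activity(moving_clip))   # content shifts every frame
```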
Generally, the more high-frequency information a video has in the time domain, that is, the more motion it has, the more data is needed to record it and the more computation is needed to encode it. On the other hand, the human eye is less sensitive to rapidly changing scenes than to still pictures (there is no time to examine the details), so dynamic scenes can be given lower priority than static ones. Balancing these two considerations when distributing bit rate is called the temporal allocation of bit rate. When it is done well, both moving and still scenes look good. When it is done badly, typically either the static parts look fine while the moving parts turn to mush, or the moving parts look better than they need to, wasting bit rate and starving the static parts so that their flaws become obvious.
Many people like to judge the image quality of a video from still screenshots. From this perspective the practice is not entirely sound: if the frame you think looks bad was taken from a high-motion scene, it is understandable that it is a little worse. You would not notice it during playback, and the bit rate is better spent on the static parts.
8、 A brief account of clarity and image quality
We often talk about how clear a video is and whether its image quality is good. But how are these two terms defined?
A common statement is "this video's definition is 1080p". After reading the sections above you should know that 1080p is only the resolution of the video and does not directly express clarity. For instance, I can upscale a 480p DVD video to 1080p. So what? Has its clarity improved?
A concept closer to clarity is the spatial high-frequency information described above, that is, the detail within a frame. A picture or video with a lot of detail is clear. Resolution determines the upper limit of high-frequency information, which is how clear the picture can possibly be. 1080p is considered better than 480p because it allows the image to record more high-frequency information. This sounds quite reliable, but there are counterexamples:
The picture on the right has far more high-frequency information than the one on the left: its lines are very sharp and it is full of dense noise. (Noise fits the definition of high-frequency information perfectly, since it makes the image change very quickly.)
But do you really think the picture on the right is clearer?
In fact, the picture on the right was produced entirely from the one on the left. Over-sharpening plus heavy noise artificially added high-frequency information that carries no meaning.
A better definition of clarity is therefore: the native, effective high-frequency information in an image or video.
"Native" emphasizes that the detail was not added artificially; "effective" emphasizes that the detail is meaningful rather than meaningless noise.
It is worth noting that artificially added high-frequency information is not always useless. Moderate sharpening can sometimes give a genuinely good visual result:
This is the effect of moderate sharpening. If you think the picture on the right looks better, at least in some places, you are not alone. Moderate sharpening remains an acceptable means of subjective adjustment in video and image processing, and on some occasions it really does improve the visual result.
That concludes the overview of clarity. Note that clarity is purely spatial (within a single frame). Once motion is also considered (whether the video smears together when it moves, or stutters noticeably when it moves, as was common with early RMVB), a good viewing experience in both space and time is what defines image quality. This is why we say that the frame-doubling effects of madVR/svp help image quality: they improve the viewing experience in time.
Good image quality is the shared goal of producers and viewers. What kind of video has good image quality? Is it true that the higher the bit rate, the better the quality? Not necessarily. The image quality of a video is determined by the following factors:
1、 The image quality of the source.
Garbage in, garbage out. If the source itself looks poor, no amount of effort will make the result look good. That is why release authors choose the best source they can. For example, a BDRip is generally better than a TVRip, even at 720p. Blu-rays are also sold in different regions; the Japanese edition sold in Japan generally has better image quality than the American, Taiwanese, or Hong Kong editions, so even among BDRips, choosing the better source gives a head start on image quality.
2、 Playback conditions.
This is whether the viewer uses hardware and software good enough for high-quality playback. It is why we promote good players vigorously whenever we release a rip: a good player sometimes matters more than any amount of effort spent on production.
3、 Bit rate invested vs coding complexity.
The temporal and spatial complexity of a video are together called its coding complexity. Video with high coding complexity tends to have a lot of detail and a lot of motion (Puella Magi Madoka Magica the Movie: Rebellion, for example), and naturally needs a high bit rate to look excellent.
Conversely, some videos have low coding complexity (Is the Order a Rabbit?, for example, with little motion and soft line detail), and such videos need much less bit rate.
4、 Efficiency and soundness of bit rate allocation.
Efficiency is how good a result a given bit rate can achieve. H264 is more efficient than the older RealVideo, and 10bit is more efficient than 8bit. With an advanced encoder, sensible parameter settings, and the encoder's advanced options fully enabled (usually at the cost of encoding time), bit rate efficiency is high.
Soundness is whether bit rate is allocated sensibly across space and time; with a sensible allocation, the result looks consistent and balanced to the viewer. Both efficiency and soundness are demands on the producer, who has to analyze the source and understand the parameter settings well.
One more remark. At this point in time (this article was published at the end of 2014), the claim that HEVC improves efficiency by 50% over AVC is still a theoretical figure on paper. In practice, HEVC encoders are far less mature than AVC encoders, which have been developed for more than ten years, so the potential of HEVC is far from being realized; at high image quality it is not even as good as AVC.
For today's mainstream BDRips aimed at archival-quality viewing, x265 has no advantage over x264 at the same bit rate. So for the near future there is no need to prefer HEVC versions for collecting, and no reason to believe blindly in "half the bit rate". Again, this applies to the present moment; I would not be surprised if, a year from now, steadily improving HEVC encoders have overturned it. 4K content, for example, has already begun to use the codec.