Audio and video engineer (Preliminary) (I) basic concepts of audio and video
2022-06-12 08:31:00 【Pry the fulcrum of the future】
1. Preface
This is the first article in the audio and video engineer series.
While learning audio and video, I found that there are not many high-quality audio and video technology blogs on the Internet. A representative one is:
- Lei Xiaohua's CSDN blog (readers know him by the nickname "Lei Shen")
As an ordinary software developer, I don't like lofty or obscure theory; I prefer things that are easy to understand, and I think knowledge itself should be easy to understand. Describing knowledge in obscure terms is an obstacle to progress and a form of knowledge monopoly.
Many audio and video standards were defined abroad, and many of the base libraries used in audio and video development also come from abroad, so there is a technology gap between China and other countries.
I hope this series can take engineering implementation as both its starting point and its goal, simplify the theory, break audio and video technology down from the complex to the simple, and make a small contribution to the development of the field and to readers' growth as audio and video engineers.
2. Classification of audio and video content
Audio and video content essentially consists of two kinds of information:
- Audio and video data: the information that the audio and video itself conveys, that is, the sound and the images.
- Audio and video parameters: the control parameters of the audio and video, such as the sampling rate and the frame rate. They are essential for processing audio and video data.
3. Formats
Container format (encapsulation)
Audio and video content consists of audio and video data plus audio and video parameters. The format in which these two kinds of information are packaged into one file is called the encapsulation format, also known as the container format. Personally, I find the term "encapsulation format" easier to understand.
Many video file formats have been invented so far. They were not invented on a whim: each one stores and processes audio and video data more efficiently in a particular usage scenario.
In development, the container format is almost always handled by a library, so we do not need to write the packing and parsing ourselves and can skip the details for now. The first task when starting out is to grasp the overall working framework of audio and video; some technical details can be set aside and studied later. This is also the learning method I suggest: spend your energy according to the 80/20 rule. In most fields only about 20% of the knowledge matters most. That 20% deserves 80% of your time, and the remaining, less important 80% only needs the other 20%.
Encoding format (compression)
Uncompressed audio and video data is very large. For example:
For 1920x1080 RGB24 images (3 bytes per pixel) at 25 frames per second, 1 hour of uncompressed video takes about 1920 * 1080 * 3 * 25 * 60 * 60 = 559,872,000,000 bytes, or roughly 521 GiB.
In that case, a 500 GB hard disk could not hold even a single 1-hour movie; scrolling through short videos would stutter badly; and downloading one movie from Baidu Netdisk at 128 KB/s would take a month or two...
Therefore, audio and video must be compressed before being stored or transmitted.
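The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `raw_video_bytes` is made up for illustration, and it assumes "128 KB/s" means 128 * 1024 bytes per second.

```python
# Sketch: size of uncompressed RGB24 video and the time needed to download it.
WIDTH, HEIGHT = 1920, 1080
BYTES_PER_PIXEL = 3        # RGB24: one byte each for R, G and B
FPS = 25                   # frames per second
DURATION_S = 60 * 60       # 1 hour, in seconds


def raw_video_bytes(width, height, bytes_per_pixel, fps, seconds):
    """Total bytes of video stored frame by frame with no compression."""
    return width * height * bytes_per_pixel * fps * seconds


total = raw_video_bytes(WIDTH, HEIGHT, BYTES_PER_PIXEL, FPS, DURATION_S)
gib = total / 2**30                       # bytes -> GiB

# Assumption: 128 KB/s is taken as 128 * 1024 bytes per second.
download_seconds = total / (128 * 1024)
download_days = download_seconds / 86400  # 86400 seconds in a day

print(f"raw size: {total} bytes, about {gib:.0f} GiB")
print(f"download time at 128 KB/s: about {download_days:.0f} days")
```

Changing the resolution or the frame rate in the constants shows how quickly the raw size grows, which is why compression cannot be avoided.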
Audio and video data is compressed with an encoding format (compression format). The compressed data and the audio and video parameters are then packaged in a container format, and the result is an audio and video file.
4. Basic unit of audio and video data
4.1 Sampling
First, understand the concept of sampling.
Video sampling means taking pictures of an object continuously at a fixed rate, recording what the object looks like at each moment.
Audio sampling means recording the vibration amplitude of the sound at each instant, continuously and at a fixed rate.
Sampling is a way of capturing the original thing. Once audio and video have been sampled, a computer can play and process the data.
4.2 Video frame
The picture captured at each moment is called a video frame. It is the basic data unit of video.
4.3 PCM audio sample
The sound amplitude captured at each moment is called a PCM sample. It is the basic data unit of audio.
Understanding this much is enough; there is no need to dig into more detail yet. The most important point is that real-world things can be captured in a computer by sampling and turned into data that can be processed.
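To make the idea of a PCM sample concrete, here is a minimal sketch that "records" a pure sine tone by evaluating it at fixed intervals and rounding each amplitude to a signed integer. The function name `sample_sine` and the 1 kHz / 8 kHz values are chosen only for illustration; a real recording would read amplitudes from a microphone instead of computing them.

```python
import math


def sample_sine(freq_hz, sample_rate, num_samples, bits=16):
    """Sample a sine tone and quantize each amplitude to a signed integer."""
    max_amp = 2 ** (bits - 1) - 1      # largest positive value, 32767 for 16 bits
    samples = []
    for i in range(num_samples):
        t = i / sample_rate            # the instant at which sample i is taken
        amplitude = math.sin(2 * math.pi * freq_hz * t)   # value in [-1.0, 1.0]
        samples.append(round(max_amp * amplitude))        # quantization step
    return samples


# A 1 kHz tone sampled 8000 times per second: one full cycle takes 8 samples.
samples = sample_sine(freq_hz=1000, sample_rate=8000, num_samples=8)
print(samples)
```

Each integer in the list is one PCM sample. Playing them back at the same 8000 samples per second would reproduce the tone.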
5. Audio and video parameters
Because audio and video data is produced by sampling, playing it back requires the parameters that were used during sampling. Only then can the data correctly reproduce the original.
5.1 Audio parameters
5.1.1 Basic parameters
Audio has three basic parameters:
- Bit depth (number of sampling bits): how many bits are used to store each sample, usually 4, 8, 16, or 32 bits. With 8 bits, each sample can take 256 different values; with 16 bits, 65,536 different values. Bit depth affects sound quality: the more bits, the closer the quantized waveform is to the original sound and the higher the quality, but the more storage is needed; the fewer bits, the lower the quality and the less storage is needed. Typically, CD-quality audio uses 16 bits and mobile communication uses 8 bits.
- Sampling rate: how many samples are taken per second. Common sampling rates are 44100 Hz (44.1 kHz) and 48000 Hz (48 kHz). The sampling rate must be at least twice the highest frequency people can hear (about 20 kHz); this follows from the Nyquist sampling theorem. If the sampling rate is lower, the high-frequency components of the original sound are lost. The Nyquist sampling theorem is one of the most important principles in a signals and systems course, but it doesn't matter if you don't fully understand it; there is no need to go deeper here.
- Number of channels: how many microphones are placed around the sound source to record it. The channels are sampled independently and at the same time. Most music is two-channel (stereo); played through headphones, it gives a certain sense of space.
These are the three most basic parameters. As long as you know them, you can record sound and play it back.
How can you remember these three parameters? Work from small to large, from few to many:
- Start with a single sample: the number of bits used to store it is the bit depth;
- The number of samples taken in one second is the sampling rate;
- The number of sampling tasks running at the same time is the number of channels.
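As a sketch of how the three parameters are enough to describe playable audio, the example below uses Python's standard `wave` module to write a short stereo tone into an in-memory WAV file. The helper name `make_tone_wav` and the tone itself are illustrative assumptions; the three `set...` calls are where bit depth, sampling rate, and channel count are declared.

```python
import io
import math
import struct
import wave

SAMPLE_WIDTH_BYTES = 2   # bit depth: 16 bits = 2 bytes per sample
SAMPLE_RATE = 44100      # sampling rate: samples per second, per channel
CHANNELS = 2             # number of channels: stereo


def make_tone_wav(freq_hz=440.0, num_frames=4410):
    """Return the bytes of a WAV file holding a sine tone (hypothetical helper)."""
    frames = bytearray()
    for i in range(num_frames):
        value = round(16000 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        # One frame = one sample per channel; "<hh" packs two little-endian int16.
        frames += struct.pack("<hh", value, value)

    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(bytes(frames))
    return buf.getvalue()


wav_bytes = make_tone_wav()
print(len(wav_bytes), "bytes of WAV data")
```

Writing `wav_bytes` to a file with a `.wav` extension should give something an ordinary player can open, because the header carries exactly these three parameters alongside the raw samples.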
5.1.2 Network parameters
Why talk about network parameters? With the growth of the Internet, audio and video have become services delivered online. Watching movies online, short videos, and online classes have become a new way of life. Once audio and video move onto the Internet, some network-related technical parameters are needed to control the service. The main one is:
- Bit rate: measured in bits per second, abbreviated bps. This parameter indicates how much network bandwidth is needed to play an audio stream. Because bit rate is a parameter of the network environment, it should be understood together with the network. Network bandwidth is the network speed, and its unit is also bps. Home broadband is typically 100 Mbps (megabits per second), which sounds fast, but because the unit is bits, you divide by 8 to convert to bytes: 100 Mbps is only 12.5 MB/s, and even gigabit is only 125 MB/s. Since file sizes are measured in bytes, the connection is not as fast as it sounds. So bit rate measures how much network bandwidth audio uses when it is transmitted over the network. When bandwidth is low, you can lower the audio bit rate, at the cost of lower audio quality; when bandwidth is high, you can raise the audio bit rate and transmit higher-quality audio.
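The unit conversions above can be sketched in a few lines. The function names are made up for illustration, and the bit rate computed here is for uncompressed PCM only (sampling rate * bit depth * channels); compressed audio would have a lower bit rate that depends on the encoder.

```python
def pcm_bit_rate(sample_rate, bit_depth, channels):
    """Bits per second needed to carry uncompressed PCM audio."""
    return sample_rate * bit_depth * channels


def mbps_to_megabytes_per_second(mbps):
    """Convert bandwidth in megabits per second to megabytes per second."""
    return mbps / 8      # 8 bits per byte


cd_bps = pcm_bit_rate(sample_rate=44100, bit_depth=16, channels=2)
print(f"CD-quality PCM: {cd_bps} bps = {cd_bps / 1000} kbps")

for link_mbps in (100, 1000):
    print(f"{link_mbps} Mbps = {mbps_to_megabytes_per_second(link_mbps)} MB/s")
```

Comparing a stream's bit rate with the available bandwidth, both in bps, is the quick way to tell whether it can play without stalling.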
5.2 Video parameters
5.2.1 Basic parameters
Video has three basic parameters:
- Pixel format: a pixel stores the color of one point, and that color can be stored in different formats. Common ones are RGB, YUV, and HSV. Learning these formats involves some math; a separate article will explain them, and I plan to write a tool that converts between them.
- Resolution: the number of pixels across the width and height of a frame, usually written as width * height, such as 1920 * 1080. The higher the resolution, the finer the picture.
- Frame rate: the number of frames played per second. The higher the frame rate, the smoother the picture; the lower the frame rate, the choppier the picture.
The basic video parameters can also be remembered from small to large:
- Start with the pixel: the format in which its color is stored is the pixel format.
- Many pixels make up a frame; the width and height of the frame is the resolution.
- Playing many frames in succession makes a video; the number of frames played in one second is the frame rate.
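The pixel, frame, and frame rate hierarchy can be sketched with a plain byte buffer. The helper names are hypothetical, and the layout is an assumption: packed RGB24 stored row by row with no padding between rows, which is one common arrangement but not the only one.

```python
BYTES_PER_PIXEL = 3   # pixel format: RGB24, one byte each for R, G, B


def new_frame(width, height):
    """One frame: width * height pixels, all black (assumed row-major, no padding)."""
    return bytearray(width * height * BYTES_PER_PIXEL)


def pixel_offset(width, x, y):
    """Byte offset of pixel (x, y) inside the frame buffer."""
    return (y * width + x) * BYTES_PER_PIXEL


def set_pixel(frame, width, x, y, rgb):
    start = pixel_offset(width, x, y)
    frame[start:start + BYTES_PER_PIXEL] = bytes(rgb)


def get_pixel(frame, width, x, y):
    start = pixel_offset(width, x, y)
    return tuple(frame[start:start + BYTES_PER_PIXEL])


width, height, fps = 4, 2, 25          # tiny resolution so the numbers stay readable
frame = new_frame(width, height)
set_pixel(frame, width, 3, 1, (255, 0, 0))   # bottom-right pixel becomes red

bytes_per_second = len(frame) * fps    # raw data rate at this frame rate
print(len(frame), "bytes per frame,", bytes_per_second, "bytes per second")
```

Replacing the tiny 4 * 2 resolution with 1920 * 1080 gives the per-frame size used in the compression example earlier in the article.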
5.2.2 Network parameters
The network parameters of video are similar to those of audio. The main one is:
- Bit rate: the meaning is the same as for audio bit rate. It is worth mentioning that video bit rates are much higher than audio bit rates, because video data is far larger than audio data; for the same reason, video is usually offered in more bit rate tiers.
Conclusion
This article has gone through the basic knowledge you must understand in audio and video. This series aims to be concise and easy to understand. Everyone is busy in modern society, and a good tutorial should minimize the cost of learning rather than complicate a simple thing with technical terms and formulas. Later articles will be published on the WeChat official account and on websites; I hope you will follow along.
Appendices and reference links
- Lei Xiaohua's blog: Introduction to video and audio data processing: PCM audio sample data processing
- PCM, Baidu Encyclopedia: https://baike.baidu.com/item/PCM/1568054
- Audio basics: a brief analysis of PCM: cloud.tencent.com/developer/article/1802685
This article was originally published on the WeChat official account Qt Future Engineer.