When customizing and developing video conference systems, some customers need to record the conference. A video conference has multiple participants, each with their own video and sound, so recording a conference means recording the video and sound of all of them into a single mp4 file.
This involves image compositing on the video side and mixing on the audio side. Mixing means running multiple channels of sound data through a mixing algorithm to synthesize a single output. The process is illustrated by a schematic diagram (not reproduced here).
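To make the idea of mixing concrete, here is a minimal sketch of one common approach: adding the samples of each channel together and clamping the sum. It assumes 16-bit little-endian PCM frames of equal length; the class and method names are hypothetical, and this is not necessarily the algorithm OMCS uses internally.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of OMCS.
public static class SimpleMixer
{
    // Mixes frames of 16-bit little-endian PCM (assumed format) into one frame.
    // Every input frame must contain at least frameLength bytes.
    public static byte[] Mix(IReadOnlyList<byte[]> frames, int frameLength)
    {
        var mixed = new byte[frameLength];
        for (int i = 0; i + 1 < frameLength; i += 2)
        {
            int sum = 0;
            foreach (byte[] frame in frames)
            {
                // Reassemble one signed 16-bit sample from its two bytes.
                sum += unchecked((short)(frame[i] | (frame[i + 1] << 8)));
            }

            // Clamp to the 16-bit range so loud passages saturate instead of wrapping.
            int clamped = Math.Max(short.MinValue, Math.Min(short.MaxValue, sum));
            mixed[i] = (byte)(clamped & 0xFF);
            mixed[i + 1] = (byte)((clamped >> 8) & 0xFF);
        }
        return mixed;
    }
}
```

Plain summation with clamping is the simplest choice; production mixers often add attenuation or dynamic gain to reduce clipping when many channels are loud at once.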

Some video conference projects need to record on the client, while others need to record on the server. The two cases call for different mixers.
OMCS provides two mixer components, AudioInOutMixer and MicrophoneConnectorMixer, one for each case.
AudioInOutMixer is used when recording on the client; MicrophoneConnectorMixer is used when recording on the server.
1. Recording a video conference on the client
When recording on the client, you generally need to record the voice and video session that the current user is taking part in. OMCS.Passive.Audio.AudioInOutMixer mixes the input data of the local microphone device with the output data of local sound playback.
The AudioInOutMixer interface is defined as follows:
    public interface IAudioInOutMixer
    {
        /// <summary>
        /// Raised with the mix of one frame of audio captured by the local microphone device
        /// and one frame of output played by the local speaker. (Audio data length: 10 ms)
        /// </summary>
        event CbGeneric<byte[]> AudioMixed;

        /// <summary>
        /// Initializes the mixer.
        /// </summary>
        /// <param name="mgr"></param>
        void Initialize(IMultimediaManager mgr);

        /// <summary>
        /// Releases the mixer.
        /// </summary>
        void Dispose();
    }
(1) After Initialize is called, the mixer starts working.
(2) AudioMixed fires once every 10 ms, and each event outputs 10 ms of mixed data.
(3) When recording is finished, call the mixer's Dispose method to release it.
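As an illustration of consuming those 10 ms frames, the following sketch appends each mixed frame to a stream and tracks how much audio has been recorded. The class name, the helper method, and the example audio format (16 kHz, 16-bit, mono) are assumptions made for illustration only; the source does not state the audio format, and the class is not part of OMCS.

```csharp
using System;
using System.IO;

// Hypothetical sink for mixed frames; not part of OMCS.
public sealed class MixedAudioSink
{
    private const int FrameMilliseconds = 10; // per the AudioMixed description above
    private readonly Stream output;
    private readonly object gate = new object();

    public long FrameCount { get; private set; }

    public MixedAudioSink(Stream output)
    {
        this.output = output;
    }

    // Size in bytes of one PCM frame for a given format (the format is an assumption).
    public static int BytesPerFrame(int sampleRate, int bitsPerSample, int channels)
    {
        return sampleRate * FrameMilliseconds / 1000 * (bitsPerSample / 8) * channels;
    }

    // Has the shape of a handler taking one byte[] argument.
    public void OnAudioMixed(byte[] frame)
    {
        // The event may be raised on a worker thread, so writes are serialized.
        lock (gate)
        {
            output.Write(frame, 0, frame.Length);
            FrameCount++;
        }
    }

    public TimeSpan RecordedDuration
    {
        get { return TimeSpan.FromMilliseconds(FrameCount * FrameMilliseconds); }
    }
}
```

Assuming CbGeneric<byte[]> is a delegate that takes a single byte[] argument, the sink could be attached with `mixer.AudioMixed += sink.OnAudioMixed;` after Initialize and detached before Dispose. With the assumed 16 kHz, 16-bit, mono format, `BytesPerFrame(16000, 16, 1)` gives 320 bytes per frame.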
2. Recording a video conference on the server
Recording on the server differs from recording on the client in two main ways:
(1) On the client, generally only one session needs to be recorded: the session that the currently logged-in user is taking part in. On the server, it is often necessary to record multiple sessions at the same time.
(2) On the client, it is enough to mix the microphone sound with the sound played by the speaker.
On the server, you need to obtain the sound data of every user taking part in the target session and mix it. You also need to handle users dynamically joining or leaving the target session.
When recording on the server, each recording task needs its own new OMCS.Passive.Audio.MicrophoneConnectorMixer instance.
MicrophoneConnectorMixer mixes the sound data of multiple MicrophoneConnector objects.
The MicrophoneConnectorMixer interface is defined as follows:
    public class MicrophoneConnectorMixer
    {
        /// <summary>
        /// Raised once every 20 milliseconds with the mixed data.
        /// Parameters: UserID of the loudest speaker - data.
        /// If nobody is speaking at that moment, the UserID parameter is null and data is silence.
        /// </summary>
        event CbGeneric<string, byte[]> AudioMixed;

        /// <summary>
        /// Adds a MicrophoneConnector to the mix.
        /// </summary>
        void AddMicrophoneConnector(MicrophoneConnector mc);

        /// <summary>
        /// Removes a MicrophoneConnector from the mix.
        /// </summary>
        void RemoveMicrophoneConnector(string ownerID);

        /// <summary>
        /// Releases the mixer.
        /// </summary>
        void Dispose();
    }
(1) Call AddMicrophoneConnector and RemoveMicrophoneConnector to add and remove users dynamically.
(2) Please pay special attention: the mixer only reads sound data from each MicrophoneConnector. It does not call the connector's BeginConnector or Disconnect method.
A MicrophoneConnector must already be connected successfully before you call AddMicrophoneConnector to add it to the mixer.
(3) When recording is finished, remember to call the Dispose method to release the mixer.
3. Mixer optimization
In actual use, there are a few things that can be optimized to get the best mixing result.
(1) When many people speak at the same time, adding every voice to the mix produces a chaotic result.
In that case, we can mix only the 1 to 3 loudest speakers.
(2) Tencent's video conferencing has a very user-friendly feature: when someone speaks (or is the loudest), their video image is enlarged to focus everyone's attention on the speaker.
Both optimizations are implemented inside the mixer. The underlying principle is roughly as follows:
(1) Before mixing the voice frames, calculate the decibel value of each frame separately. (The decibel value of sound can be calculated with a Fourier transform.)
(2) Sort the calculated decibel values from largest to smallest.
(3) Submit only the 1 to 3 voice frames with the highest decibel values to the mixing algorithm.
(4) When outputting the mixing result, also output the ID of the user with the highest decibel value.
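The four steps above can be sketched as follows. Note one deliberate substitution: instead of a Fourier transform, this sketch estimates loudness from the time-domain RMS of each frame, which is simpler and is enough for ranking speakers. It assumes 16-bit little-endian PCM frames; all names are hypothetical and this is not the OMCS implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of OMCS.
public static class LoudestSelector
{
    // RMS level of a 16-bit little-endian PCM frame in dBFS (0 dB = full scale).
    // A completely silent or empty frame yields negative infinity.
    public static double DecibelsOf(byte[] frame)
    {
        int sampleCount = frame.Length / 2;
        if (sampleCount == 0) return double.NegativeInfinity;

        double sumOfSquares = 0;
        for (int i = 0; i + 1 < frame.Length; i += 2)
        {
            short sample = unchecked((short)(frame[i] | (frame[i + 1] << 8)));
            sumOfSquares += (double)sample * sample;
        }

        double rms = Math.Sqrt(sumOfSquares / sampleCount);
        return rms > 0 ? 20 * Math.Log10(rms / 32768.0) : double.NegativeInfinity;
    }

    // Returns up to topN user IDs, loudest first. Silent frames are excluded,
    // so an empty result means nobody is speaking.
    public static List<string> SelectLoudest(IDictionary<string, byte[]> framesByUser, int topN)
    {
        return framesByUser
            .Select(pair => new { UserId = pair.Key, Level = DecibelsOf(pair.Value) })
            .Where(entry => !double.IsNegativeInfinity(entry.Level))
            .OrderByDescending(entry => entry.Level)
            .Take(topN)
            .Select(entry => entry.UserId)
            .ToList();
    }
}
```

Only the frames of the returned users would then be passed to the mixing algorithm, and the first ID in the list is the one to report as the loudest speaker.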
By setting the AudioMixedStrategy property on the Advanced member of IMultimediaManager, you can specify how many of the highest-decibel lines take part in the mix.
    /// <summary>
    /// Mixing strategy.
    /// </summary>
    public enum AudioMixedStrategy
    {
        /// <summary>
        /// Every line that has sound data takes part in the mix.
        /// </summary>
        All = 0,

        /// <summary>
        /// Mix only the line with the highest decibel value.
        /// </summary>
        DecibelTop1,

        /// <summary>
        /// Mix only the two lines with the highest decibel values.
        /// </summary>
        DecibelTop2,

        /// <summary>
        /// Mix only the three lines with the highest decibel values.
        /// </summary>
        DecibelTop3
    }
As we have seen, the AudioMixed event of MicrophoneConnectorMixer outputs not only the mixed data but also the ID of the user with the highest volume.
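One way to use that user ID for the "enlarge the speaker's video" feature is sketched below. The handler has the same parameter shape as the AudioMixed event (user ID, then data). The requirement that a user stay loudest for several consecutive frames before the focus switches is an added assumption to avoid flicker, not something described in the source, and the class itself is hypothetical.

```csharp
using System;

// Hypothetical helper, not part of OMCS.
public sealed class ActiveSpeakerTracker
{
    private readonly int requiredFrames;
    private string candidate;
    private int candidateFrames;

    // The user whose video should currently be emphasized (null until someone speaks).
    public string ActiveSpeaker { get; private set; }

    public event Action<string> ActiveSpeakerChanged;

    // requiredFrames: consecutive frames a user must be loudest before taking focus.
    public ActiveSpeakerTracker(int requiredFrames)
    {
        this.requiredFrames = requiredFrames;
    }

    public void OnAudioMixed(string loudestUserId, byte[] data)
    {
        if (loudestUserId == null)
        {
            // Silence: keep the current speaker but reset the pending candidate.
            candidate = null;
            candidateFrames = 0;
            return;
        }

        if (loudestUserId == candidate)
        {
            candidateFrames++;
        }
        else
        {
            candidate = loudestUserId;
            candidateFrames = 1;
        }

        if (candidateFrames >= requiredFrames && candidate != ActiveSpeaker)
        {
            ActiveSpeaker = candidate;
            ActiveSpeakerChanged?.Invoke(ActiveSpeaker);
        }
    }
}
```

With events arriving every 20 ms, a value of 25 for `requiredFrames` would mean a user has to remain the loudest for about half a second before the focus moves.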
With these optimizations, the data output by the mixer is easy to work with and meets the practical recording needs of current video conference projects.







