How to implement a real-time audio and video chat function

2022-07-05 04:01:00 wecloud1314

As is well known, the barrier to entry for real-time audio and video chat technology is very high, and it is hard for an ordinary company to close the technical gap in this field in a short time. The open source audio and video project WebRTC offers a shortcut (the products of the author's company are also built on WebRTC).

Drawing on the practical WebRTC experience gained from the online consultation product developed by the author's company, this article explains how to build a real-time audio and video chat function from scratch with WebRTC. It starts with WebRTC basics and technical principles, then shows how to build the function with open source technology.

 

WebRTC (Web Real-Time Communication) originated from the GIPS engine of Global IP Solutions, a VoIP software developer that Google acquired in 2010 for USD 68.2 million. Google renamed the engine "WebRTC" and open-sourced it in 2011, aiming to establish a platform for real-time audio, video, and data communication between Internet browsers.

So what can WebRTC do?

Besides real-time audio and video calls in the traditional IM and social apps we use every day, such as WeChat, DingTalk, and QQ, typical scenarios include online consultation, remote clinics, and remote consultation in the medical field (the area covered by the author's company's products), as well as the popular interactive live streaming and online education. In addition, with the rapid rollout of 5G, WebRTC also provides good technical support for cloud gaming.

WebRTC as a whole can be roughly divided into the following 3 parts (the colors refer to the WebRTC architecture diagram):

    1) The purple part is the API offered to Web front-end developers;

    2) The solid blue part is the API offered to the major browser vendors;

    3) The dotted blue part contains 3 components: the audio engine, the video engine, and network transport (Transport), all of which can be customized.

Technical difficulties of P2P communication

P2P communication means peer-to-peer communication.

What difficulties arise, and what problems must be solved, to implement real-time audio and video communication between two clients (different Web browsers or mobile apps, each with a microphone and camera) in different network environments?

In summary, there are 3 main questions:

    1) How do the two sides learn of each other's existence, that is, how do they find each other?

    2) How do they communicate their audio and video codec capabilities to each other?

    3) How is audio and video data transmitted so that each side can see the other?

How to know each other's existence (that is, how to find each other)?

For question 1: although WebRTC supports peer-to-peer communication, that does not mean WebRTC no longer needs a server.

During P2P communication, the two sides need to exchange some metadata, such as media information and network information. This process is usually called "signaling".
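
As a rough illustration of what signaling carries, the sketch below models the metadata exchange with an in-memory relay. WebRTC does not define a signaling format, so the message shape (`kind`, `from`, `to`) and the `SignalingRelay` class are hypothetical names chosen for this example; a real deployment would deliver the messages over a transport such as a WebSocket.

```typescript
// Hypothetical message shapes: WebRTC itself does not define a signaling format.
type SignalingMessage =
  | { kind: "offer"; from: string; to: string; sdp: string }
  | { kind: "answer"; from: string; to: string; sdp: string }
  | { kind: "candidate"; from: string; to: string; candidate: string };

// An inbox stands in for a real network connection to a peer.
type Inbox = SignalingMessage[];

class SignalingRelay {
  private readonly inboxes = new Map<string, Inbox>();

  // Register a peer so that messages can be delivered to it.
  register(peerId: string): Inbox {
    const inbox: Inbox = [];
    this.inboxes.set(peerId, inbox);
    return inbox;
  }

  // Forward a message to its addressee; returns false if the peer is unknown.
  send(message: SignalingMessage): boolean {
    const inbox = this.inboxes.get(message.to);
    if (inbox === undefined) {
      return false;
    }
    inbox.push(message);
    return true;
  }
}
```

The relay never inspects the SDP or candidate payload; it only routes it, which is all a signaling server needs to do for the exchange itself.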

 

The corresponding server is the "signaling server". It is often also called the "room server", because besides exchanging the peers' media and network information it can also manage room information.

For example:

    1) Notify each peer about who has joined the room;

    2) Notify each peer about who has left the room;

    3) Tell a third party whether the room is full and whether it can join.
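
The room bookkeeping listed above can be sketched independently of any transport. The `Room` class, the `JoinResult` type, and the default capacity of 2 are assumptions made for this illustration; they are not taken from the author's signaling server.

```typescript
// Result of a join attempt: either the list of peers to notify, or a rejection.
type JoinResult =
  | { ok: true; others: string[] }
  | { ok: false; reason: "full" };

class Room {
  private readonly members = new Set<string>();
  private readonly capacity: number;

  // A capacity of 2 models a one-to-one call (an assumption of this sketch).
  constructor(capacity: number = 2) {
    this.capacity = capacity;
  }

  isFull(): boolean {
    return this.members.size >= this.capacity;
  }

  // Returns the peers already in the room, who should be told about the newcomer.
  join(peerId: string): JoinResult {
    if (!this.members.has(peerId) && this.isFull()) {
      return { ok: false, reason: "full" };
    }
    const others = Array.from(this.members).filter((id) => id !== peerId);
    this.members.add(peerId);
    return { ok: true, others };
  }

  // Returns the remaining peers, who should be told that peerId has left.
  leave(peerId: string): string[] {
    this.members.delete(peerId);
    return Array.from(this.members);
  }
}
```

A signaling server would call `join` and `leave` from its connection handlers and push the returned peer lists out as notifications.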

To avoid redundancy and maximize compatibility with existing technologies, the WebRTC standard does not specify signaling methods or protocols. The practice section later in this article uses Koa and Socket.io to implement a signaling server.

How to communicate each other's audio and video codec capabilities?

For question 2: the first thing to know is that different browsers have different audio and video encoding and decoding capabilities.

For example: the Peer-A client supports formats such as H264 and VP8, while the Peer-B client supports formats such as H264 and VP9. To ensure that both sides can encode and decode correctly, the simplest approach is to take the intersection of the formats both support: H264.
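
The intersection in the Peer-A/Peer-B example can be written out as a small function. This is only a sketch of the idea; the helper name `commonCodecs` is made up here, and real WebRTC stacks perform this negotiation internally through SDP rather than through a call like this.

```typescript
// Return the codecs that both sides support, keeping the local side's order.
function commonCodecs(local: string[], remote: string[]): string[] {
  // Compare case-insensitively, since codec names may differ in capitalization.
  const remoteSet = new Set(remote.map((codec) => codec.toUpperCase()));
  return local.filter((codec) => remoteSet.has(codec.toUpperCase()));
}

// Peer-A supports H264 and VP8; Peer-B supports H264 and VP9.
const shared = commonCodecs(["H264", "VP8"], ["H264", "VP9"]); // ["H264"]
```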

WebRTC uses a dedicated protocol, the Session Description Protocol (SDP), to describe this information.

Therefore: for the two parties in an audio and video call to learn which media formats the other supports, they must exchange SDP information. This exchange of SDP is usually called media negotiation.
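
To make the SDP exchange more concrete, the sketch below extracts codec names from the `a=rtpmap` lines of a session description. The sample SDP is a heavily shortened, hand-written fragment (a real browser offer is far longer), and `codecsFromSdp` is a hypothetical helper, not part of any WebRTC API.

```typescript
// Collect codec names from "a=rtpmap:<payload type> <name>/<clock rate>" lines.
function codecsFromSdp(sdp: string): string[] {
  const codecs: string[] = [];
  for (const line of sdp.split(/\r?\n/)) {
    const match = /^a=rtpmap:\d+ ([^\/]+)\//.exec(line);
    const name = match === null ? undefined : match[1];
    if (name !== undefined && codecs.indexOf(name) === -1) {
      codecs.push(name);
    }
  }
  return codecs;
}

// Hand-written, shortened sample; payload types 96 and 102 are arbitrary here.
const sampleOffer = [
  "v=0",
  "o=- 0 0 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=video 9 UDP/TLS/RTP/SAVPF 96 102",
  "a=rtpmap:96 VP8/90000",
  "a=rtpmap:102 H264/90000",
].join("\r\n");

const offeredCodecs = codecsFromSdp(sampleOffer); // ["VP8", "H264"]
```

Each side reads the other's description in this way and answers with the subset it can handle, which is what makes the exchange a negotiation rather than a one-way announcement.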

How to transmit audio and video data so that each side can see the other?

For question 3: this is essentially a process of network negotiation. Both parties in a real-time audio and video call need to learn each other's network situation so that they can find a link over which to communicate.

Ideally, each browser's computer would have its own public IP address, so that a peer-to-peer connection could be established directly.

In practice: because of network security concerns and the shortage of IPv4 addresses, our computers usually sit inside local area networks of various sizes and need NAT (Network Address Translation). WebRTC uses the ICE mechanism to establish network connections.

So what is ICE?

    ICE (Interactive Connectivity Establishment): ICE is not a single protocol, but a framework that integrates the STUN and TURN protocols.

Among them: STUN (Session Traversal Utilities for NAT) allows a client behind a NAT (or multiple NATs) to discover its public IP address and port. This is the basis of what is commonly known as P2P "hole punching".

However: if the NAT is symmetric, hole punching cannot succeed. This is where TURN comes in handy. TURN (Traversal Using Relays around NAT) is an extension of the STUN (RFC 5389) protocol that adds a relay function.

Simply put: its purpose is to solve the problem that symmetric NATs cannot be traversed. When obtaining a usable public address through STUN fails, the client can request a public IP address from the TURN server to use as a relay address.
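
On the client side, the STUN and TURN servers are usually handed to WebRTC as a list of ICE servers. The sketch below builds such a list; the field names `iceServers`, `urls`, `username`, and `credential` mirror the configuration accepted by the browser's `RTCPeerConnection`, while the host name, the credentials, and the helper `buildPeerConfig` are placeholders invented for this example. Port 3478 is the default port for STUN and TURN.

```typescript
// Local copies of the relevant configuration shapes, so the sketch runs outside a browser.
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

interface PeerConfig {
  iceServers: IceServer[];
}

// Build a configuration that offers STUN first and TURN as the relay fallback.
function buildPeerConfig(host: string, username: string, credential: string): PeerConfig {
  return {
    iceServers: [
      { urls: `stun:${host}:3478` },
      { urls: `turn:${host}:3478`, username, credential },
    ],
  };
}

// Placeholder host and credentials; replace with the values of a real server.
const peerConfig = buildPeerConfig("turn.example.org", "demo-user", "demo-pass");
```

In a browser, an object of this shape could be passed to `new RTCPeerConnection(peerConfig)`; the ICE machinery then decides on its own whether the relay is needed.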

WebRTC has three types of ICE candidates:

    1) Host candidates: the IP address and port in the local LAN. They have the highest priority of the three; that is, at the lower level WebRTC first tries to establish a connection within the local LAN;

    2) Reflexive candidates: the external (public) IP address and port of a host behind a NAT. Their priority is lower than that of host candidates; that is, when the attempt to connect over the local LAN fails, WebRTC tries the IP address and port obtained from the reflexive candidate;

    3) Relay candidates: the IP address and port of the relay server, meaning that media data is forwarded through the server. When the two WebRTC clients cannot traverse the NAT to communicate P2P, relaying through the server is the only way to keep the communication working.
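
The ordering of the three candidate types follows from the priority formula in the ICE specification (RFC 8445): priority = 2^24 × type preference + 2^8 × local preference + (256 − component ID). The sketch below uses the type preferences the RFC recommends (126 for host, 100 for server-reflexive, i.e. the "reflex" candidates above, and 0 for relay); actual browsers may choose different values, and the function names are made up for this illustration.

```typescript
type CandidateType = "host" | "srflx" | "relay";

// Type preferences recommended by RFC 8445; implementations may pick others.
const TYPE_PREFERENCE: Record<CandidateType, number> = {
  host: 126,
  srflx: 100,
  relay: 0,
};

// priority = 2^24 * typePreference + 2^8 * localPreference + (256 - componentId)
function candidatePriority(
  type: CandidateType,
  localPreference: number = 65535,
  componentId: number = 1,
): number {
  return 16777216 * TYPE_PREFERENCE[type] + 256 * localPreference + (256 - componentId);
}

// Sort a copy of the list so that the most preferred candidate type comes first.
function sortByPriority(types: CandidateType[]): CandidateType[] {
  return types.slice().sort((a, b) => candidatePriority(b) - candidatePriority(a));
}

const hostPriority = candidatePriority("host"); // 2130706431
const tryOrder = sortByPriority(["relay", "host", "srflx"]); // ["host", "srflx", "relay"]
```

Because the type preference is multiplied by the largest factor, it dominates the result, which is why a host candidate is tried before a reflexive one and a relay candidate comes last.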

Outside the local LAN, WebRTC obtains its own public IP address and port from the STUN server, then exchanges network information with the remote WebRTC peer through the signaling server, after which both sides can try to establish a P2P connection. When NAT traversal fails, traffic is forwarded through the relay server (TURN).

It is worth mentioning that: in WebRTC, network information is usually described by candidates, and the STUN server and relay server can be the same server. The practice section at the end of the article uses coturn, an open source project that integrates STUN (hole punching) and TURN (relay) functions.

In short: use the API provided by WebRTC to obtain each end's media information (SDP) and network information (candidates), exchange them through the signaling server, and then establish the connection between the two ends to complete the real-time video and voice call.

Copyright notice
This article was written by [wecloud1314]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/186/202207050359471437.html