Key technical points of an IM system handling millions of messages

2022-06-23 23:11:00 wecloud1314

If we look closely, we find that almost every kind of Internet service in daily life includes an IM system.

For example:

    1) Basic services: Tencent News (comments on news items);

    2) Business applications: DingTalk (approval workflow notifications);

    3) Communication and entertainment: QQ / WeChat (private chat, group chat, discussion groups, Moments);

    4) Self-media platforms: Douyin (TikTok) / Kuaishou (Kwai) (like and tip notifications).

 

Across these many products in the Internet ecosystem, the instant messaging system is an underlying capability that has always played a vital role in keeping the business running and improving the user experience.

As a result, instant messaging technology in today's Internet products is no longer limited to traditional IM chat tools; it is embedded, visibly or invisibly, in all kinds of Internet applications. For many developers, IM (instant messaging) technology is essential domain knowledge.

A typical IM system usually has to deliver four things: high reliability, high availability, real-time delivery, and message ordering.

For my project, the key points of the architecture design are:

    1) Microservices: split into a user service, a message connection service, and a messaging service;

    2) Storage architecture: to balance performance against resource overhead, use redis & mysql;

    3) High availability: to support high-concurrency scenarios, use the websocket support provided by Spring;

    4) Multi-terminal message synchronization: app, web, WeChat official account, and mini program messages;

    5) Support for both online and offline message scenarios.

Understanding read diffusion and write diffusion

Let's use an example to show what read diffusion and write diffusion are:

Take a group chat called "A Loving Family" whose members are Dad, Mom, my brother, my sister, and me (5 people in total).

I recently got a girlfriend, so I send the message "I'm no longer single" to the group, and naturally I hope my parents, brother, and sister all receive it.

Under the normal logic, sending a group chat message should go like this:

    1) Iterate over the members of the group chat and send the message to each;

    2) Query the online status of each member;

    3) If a member is offline, store the message for offline delivery;

    4) If a member is online, push the message in real time.

The problem: if step 4 fails, online group members lose the message, because nothing was stored for them. The family never learns that I'm no longer single, with the serious consequence that they keep pressing me to get married.
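The failure described above can be reproduced with a small in-memory sketch. The function name, the `flaky_push` transport, and the idea that a failed push raises `ConnectionError` are all assumptions made for illustration; they are not part of the original project.

```python
from typing import Callable, Dict, List


def send_group_message_naive(
    members: List[str],
    online: Dict[str, bool],
    offline_store: Dict[str, List[str]],
    push: Callable[[str, str], None],
    text: str,
) -> List[str]:
    """Naive flow: store only for offline members, push directly to online ones.

    Returns the members who ended up with no copy of the message at all.
    """
    lost = []
    for member in members:                      # 1) iterate over the group members
        if not online.get(member, False):       # 2) query online status
            offline_store.setdefault(member, []).append(text)  # 3) store offline
            continue
        try:
            push(member, text)                  # 4) real-time push
        except ConnectionError:
            lost.append(member)                 # nothing was stored, so the message is gone
    return lost


def flaky_push(member: str, text: str) -> None:
    # Hypothetical transport: Mom's connection drops in the middle of the push.
    if member == "mom":
        raise ConnectionError("connection dropped")


recipients = ["dad", "mom", "brother", "sister"]
online_status = {"dad": True, "mom": True, "brother": False, "sister": True}
offline_store: Dict[str, List[str]] = {}

lost_members = send_group_message_naive(
    recipients, online_status, offline_store, flaky_push, "I'm no longer single"
)
print(lost_members)   # ['mom']
print(offline_store)  # only the offline brother has a stored copy
```

Mom was online, so nothing was stored for her, and the failed push leaves her with no way to recover the message.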

So the optimized scheme is: store the message first, whether or not the group members are online.

Following this idea, the optimized group message flow is:

    1) Iterate over the members of the group chat and send the message to each;

    2) Save a copy of the message for every member of the group chat;

    3) Query the online status of each member;

    4) Push in real time to members who are online.

This optimized scheme is what is called "write diffusion".
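As a rough sketch of this store-first flow, assuming plain dictionaries stand in for the real storage and that a failed push raises `ConnectionError` (both are illustrative assumptions, as are all the names used):

```python
from typing import Callable, Dict, List


def send_with_write_diffusion(
    members: List[str],
    online: Dict[str, bool],
    inboxes: Dict[str, List[str]],
    push: Callable[[str, str], None],
    text: str,
) -> List[str]:
    """Store a full copy for every member first, then push to online members.

    Returns the members who received the real-time push.
    """
    pushed = []
    for member in members:
        inboxes.setdefault(member, []).append(text)  # every member saves a copy
        if not online.get(member, False):
            continue
        try:
            push(member, text)
            pushed.append(member)
        except ConnectionError:
            pass  # the copy is already stored, so the member can fetch it later
    return pushed


def unreliable_push(member: str, text: str) -> None:
    # Hypothetical transport: Mom's connection drops in the middle of the push.
    if member == "mom":
        raise ConnectionError("connection dropped")


family = ["dad", "mom", "brother", "sister"]
status = {"dad": True, "mom": True, "brother": False, "sister": True}
inboxes: Dict[str, List[str]] = {}

pushed_to = send_with_write_diffusion(
    family, status, inboxes, unreliable_push, "I'm no longer single"
)
print(pushed_to)       # ['dad', 'sister']
print(inboxes["mom"])  # the copy survives even though the push failed
```

The same failed push no longer loses the message, at the cost of one stored copy per member.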

The problem: everyone stores an identical copy of the "I'm no longer single" message, which wastes a lot of disk space and bandwidth (this is the biggest drawback of write diffusion).

So the next optimization is: store the group message entity only once, and let each user store only an index of message IDs.

The optimized flow for sending a group message then becomes:

    1) Iterate over the members of the group chat and send the message to each;

    2) Save a single message entity first;

    3) Then save a reference to the message entity ID for every member of the group chat;

    4) Query the online status of each member;

    5) Push in real time to members who are online.

The scheme after this second optimization is what is called "read diffusion".
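A minimal sketch of the single-entity-plus-references idea, again with dictionaries as stand-ins for real storage. The names `message_store`, `user_queues`, `send_with_read_diffusion`, and `read_messages` are hypothetical, and the online check and push steps are left out to keep the focus on storage.

```python
from typing import Dict, List

# One shared store of message entities, keyed by message ID.
message_store: Dict[int, str] = {}
# Per-user queues that hold only message IDs, never message bodies.
user_queues: Dict[str, List[int]] = {}


def send_with_read_diffusion(msg_id: int, text: str, members: List[str]) -> None:
    """Save the entity once, then append its ID to every member's queue."""
    message_store[msg_id] = text
    for member in members:
        user_queues.setdefault(member, []).append(msg_id)


def read_messages(member: str) -> List[str]:
    """Resolve a member's ID references back to message bodies at read time."""
    return [message_store[msg_id] for msg_id in user_queues.get(member, [])]


send_with_read_diffusion(1, "I'm no longer single", ["dad", "mom", "brother", "sister"])
print(len(message_store))       # 1: the body is stored once
print(user_queues["mom"])       # [1]: each member stores only a reference
print(read_messages("sister"))  # the body is looked up when the member reads
```

The extra lookup in `read_messages` is where the read side becomes heavier, in exchange for storing the body only once.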

To sum up:

    1) Read diffusion: reads are heavy, writes are very light, and resource consumption is relatively small;

    2) Write diffusion: reads are very light, writes are heavy, and resource consumption is relatively large.

Message entity model:

Common messaging services can be abstracted into a few entity concepts: user / user relationship / user device / user connection status / message / message queue.

Explanation of the entity concepts:

User entity:

    1) User -> user terminal device: each user can log in and send and receive messages on multiple terminals;

    2) User -> message: with read diffusion in mind, the relationship between a user and messages is 1:n;

    3) User -> message queue: with read diffusion in mind, each user maintains their own "message list" (1:1). To allow for growth, you can also add message overflow lists to hold message data beyond the capacity of the "message list" (in which case the relationship is 1:n);

    4) User -> user connection status: because a user can log in on multiple terminals, the app and web terminals each have their own online status (1:n);

    5) User -> contact relationship: users are ultimately connected through some kind of business context, forming multiple contact relationships that become private chats or group chats (1:n).

Contact relationship (the relationships between users are determined mainly by the business), for example:

    1) A family group has as many members as the family has people;

    2) In ToB scenarios, such as an enterprise on DingTalk, there are often business discussion groups.

Message entity:

Message -> message queue: with read diffusion in mind, a message ends up belonging to one or more message queues, so in a group chat it is spread across different message queues.

Message queue entity:

Message queue: more precisely a message reference queue, whose index elements ultimately point to the concrete message entity objects.
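The relationships above could be modeled roughly as follows. This is only an illustrative sketch: the class names, field names, and the `resolve` helper are assumptions, and connection status and overflow lists are omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Message:
    msg_id: int
    sender_id: str
    body: str


@dataclass
class MessageQueue:
    """A message reference queue: it holds message IDs, not message bodies."""
    msg_ids: List[int] = field(default_factory=list)


@dataclass
class User:
    user_id: str
    devices: List[str] = field(default_factory=list)           # user -> terminal devices (1:n)
    contacts: List[str] = field(default_factory=list)          # user -> contact relationships (1:n)
    queue: MessageQueue = field(default_factory=MessageQueue)  # user -> message list (1:1)


def resolve(user: User, store: Dict[int, Message]) -> List[Message]:
    """Follow the user's queue references to the concrete message entities."""
    return [store[msg_id] for msg_id in user.queue.msg_ids]


entity_store = {1: Message(msg_id=1, sender_id="me", body="I'm no longer single")}
dad = User(user_id="dad", devices=["android-app"], contacts=["me"])
mom = User(user_id="mom", devices=["ios-app", "web"], contacts=["me"])
for member in (dad, mom):
    member.queue.msg_ids.append(1)  # both queues reference the same entity

print(resolve(dad, entity_store)[0].body)
print(resolve(dad, entity_store)[0] is resolve(mom, entity_store)[0])  # True: one shared entity
```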

User connection status:

    1) App: disconnected because of network problems, or the user manually kills the application process; both count as offline;

    2) Web: the browser is disconnected because of network problems, or the user manually closes the tab; both count as offline;

    3) Official account: has no online or offline state of its own;

    4) Mini program: has no online or offline state of its own.
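One way to picture this is a small registry that tracks connection state only for app and web terminals. This sketch reads items 3 and 4 as "no connection state is tracked for these terminals", which is an interpretation rather than something the text spells out; `Terminal`, `ConnectionRegistry`, and their methods are hypothetical names.

```python
from enum import Enum
from typing import Dict, Set


class Terminal(Enum):
    APP = "app"
    WEB = "web"
    OFFICIAL_ACCOUNT = "official_account"
    MINI_PROGRAM = "mini_program"


# Assumption: only these terminals hold a connection whose state can be tracked.
TRACKED_TERMINALS = {Terminal.APP, Terminal.WEB}


class ConnectionRegistry:
    """Tracks which terminals of each user are currently connected (1:n)."""

    def __init__(self) -> None:
        self._online: Dict[str, Set[Terminal]] = {}

    def connect(self, user_id: str, terminal: Terminal) -> None:
        if terminal not in TRACKED_TERMINALS:
            raise ValueError(f"{terminal.value} has no online/offline state of its own")
        self._online.setdefault(user_id, set()).add(terminal)

    def disconnect(self, user_id: str, terminal: Terminal) -> None:
        # Covers network drops, a killed app process, and a closed browser tab.
        self._online.get(user_id, set()).discard(terminal)

    def online_terminals(self, user_id: str) -> Set[Terminal]:
        return set(self._online.get(user_id, set()))

    def is_online(self, user_id: str) -> bool:
        return bool(self._online.get(user_id))


registry = ConnectionRegistry()
registry.connect("me", Terminal.APP)
registry.connect("me", Terminal.WEB)
registry.disconnect("me", Terminal.APP)  # e.g. the user kills the app process
print(registry.online_terminals("me"))   # only the web terminal is still online
```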

User terminal device:

App clients are usually Android & iOS, web clients are usually browsers, and there are other, more flexible WebView clients (official account / mini program).

 

Message storage scheme

For message storage there are essentially only three options: keep it in memory, keep it on disk, or combine the two (large companies are said to keep active message data in memory to optimize performance; they can afford it, after all).

The advantages and disadvantages of the main schemes are analyzed below:

    1) Scheme one: for performance, store all data in redis;

    2) Scheme two: to conserve resources, store the data in redis + mysql.

Scheme one: redis

Premise: users and contact relationships are business data, so they are stored in a relational database by default.

The flow is as follows:

    1) A user sends a message;

    2) redis creates a message entity and a message entity counter;

    3) redis adds a reference to the entity to user B's user queue;

    4) User B pulls the message (the pull mode is covered later, in 5.2).

Implementation:

    1) User queue: zset (score guarantees ordering);

    2) Message entity list: hash (msg_id guarantees uniqueness);

    3) Message entity counter: hash (tracks the number of references to a group chat message; when the count drops to zero, the corresponding message is deleted from the entity list to save resources).
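The interplay of the three structures can be sketched without a redis server by using plain dictionaries as stand-ins for the zset and the two hashes. The class and method names are hypothetical, and the choice to decrement the counter when a user pulls a message is an assumption; the text does not say exactly when a reference is released.

```python
from typing import Dict, List


class InMemoryMessageStore:
    """Dictionary-based stand-in for the redis structures in scheme one."""

    def __init__(self) -> None:
        self.entities: Dict[int, str] = {}             # hash: msg_id -> message body
        self.ref_counts: Dict[int, int] = {}           # hash: msg_id -> outstanding references
        self.queues: Dict[str, Dict[int, float]] = {}  # per-user zset: msg_id -> score

    def send(self, msg_id: int, body: str, recipients: List[str], score: float) -> None:
        self.entities[msg_id] = body
        self.ref_counts[msg_id] = len(recipients)
        for user in recipients:
            self.queues.setdefault(user, {})[msg_id] = score

    def pull(self, user: str) -> List[str]:
        """Return the user's pending messages in score order and release the references."""
        queue = self.queues.pop(user, {})
        ordered_ids = sorted(queue, key=lambda msg_id: (queue[msg_id], msg_id))
        bodies = []
        for msg_id in ordered_ids:
            bodies.append(self.entities[msg_id])
            self.ref_counts[msg_id] -= 1
            if self.ref_counts[msg_id] == 0:  # last reference gone: free the entity
                del self.entities[msg_id]
                del self.ref_counts[msg_id]
        return bodies


store = InMemoryMessageStore()
store.send(1, "I'm no longer single", ["dad", "mom"], score=1655996000.0)

print(store.pull("dad"))    # dad receives the message
print(1 in store.entities)  # True: mom still holds a reference
print(store.pull("mom"))    # mom receives the message
print(1 in store.entities)  # False: the count reached zero and the entity was deleted
```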

Advantage: everything is an in-memory operation, so response performance is good.

Disadvantages:

    1) Memory consumption is huge. Apart from large companies, small companies cannot afford to spend their servers' precious memory this way; as the business grows, if you do not want to add resources you have to clean up data manually;

    2) It depends heavily on the redis disaster recovery strategy: if redis goes down, data is lost outright (this can be mitigated with redis cluster deployment, the sentinel mechanism, master-slave replication, and similar measures).

Scheme two: redis + mysql

Premise: users and contact relationships are business data, so they are stored in a relational database by default.

The flow is as follows:

    1) A user sends a message;

    2) mysql creates a message entity;

    3) redis adds a reference to the entity to user B's user queue;

    4) User B pulls the message (the pull mode is covered later).

Implementation:

    1) User queue: zset (score guarantees ordering);

    2) Message entity list: moved to mysql (the table's primary key id guarantees uniqueness);

    3) Message entity counter: hash (this concept is dropped, because total disk capacity is far larger than total memory. Even if everything stays in the mysql database, a business at the scale of millions of messages will not run into major problems; at a much larger volume, you need to consider splitting tables and databases to keep processing and retrieval performant).
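The split between the two stores can be sketched as follows. To keep the example runnable without any servers, Python's built-in `sqlite3` stands in for mysql and a dictionary stands in for the redis zset; the table name, column names, and function names are all illustrative assumptions.

```python
import sqlite3
from typing import Dict, List

# sqlite3 in-memory database as a stand-in for mysql (message entities on "disk").
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE message (id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
)

# Dictionary as a stand-in for the per-user redis zset: msg_id -> score.
pending: Dict[str, Dict[int, float]] = {}


def send_message(body: str, recipients: List[str], score: float) -> int:
    """Insert the entity once, then queue its primary key for each recipient."""
    cursor = db.execute("INSERT INTO message (body) VALUES (?)", (body,))
    db.commit()
    msg_id = cursor.lastrowid  # the primary key guarantees uniqueness
    for user in recipients:
        pending.setdefault(user, {})[msg_id] = score
    return msg_id


def pull_messages(user: str) -> List[str]:
    """Read queued IDs in score order and fetch the bodies from the database."""
    queue = pending.pop(user, {})
    ordered_ids = sorted(queue, key=lambda msg_id: (queue[msg_id], msg_id))
    bodies = []
    for msg_id in ordered_ids:
        row = db.execute("SELECT body FROM message WHERE id = ?", (msg_id,)).fetchone()
        bodies.append(row[0])
    return bodies


send_message("I'm no longer single", ["dad", "mom"], score=1.0)
send_message("Dinner on Sunday?", ["dad"], score=2.0)
print(pull_messages("dad"))
# No reference counter: entities stay in the table after they are pulled.
print(db.execute("SELECT COUNT(*) FROM message").fetchone()[0])
```

Only the small ID queue lives in memory here; every pull pays a database read per message, which is the response-time cost of this scheme.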

Advantages:

    1) Message entities, which account for the largest share of the data, are kept on disk, which saves a great deal of memory;

    2) Disk resources are easy to expand, cheap, and practical.

Disadvantage: reads hit the disk, so response performance is poorer (from a product design perspective, how much this matters depends on whether the IM system you maintain is a strong IM or a weak IM).

Copyright notice
This article was written by [wecloud1314]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/174/202206231902418779.html