
Transport layer protocol -- TCP protocol

2022-06-12 09:51:00 2021dragon

The TCP protocol

On reliability

TCP's full name is "Transmission Control Protocol". It is the most widely used transport-layer protocol on the Internet today, bar none.

TCP is so widely used fundamentally because it provides thorough reliability guarantees. Many upper-layer applications are built on top of TCP, such as HTTP, HTTPS, FTP, and SSH; even MySQL uses TCP underneath.

Why is the network unreliable?

Most modern computers are based on the von Neumann architecture.

(figure: the von Neumann architecture)

Although the input devices, output devices, memory, and CPU here all sit in one machine, these hardware components are independent of each other. For them to exchange data, there must be a way for them to communicate, so these devices are in fact connected by "wires": the wires connecting memory and the peripherals are called the I/O bus, and the wires connecting memory and the CPU are called the system bus. Because all of these devices sit inside the same machine, the "wires" carrying the data are very short, and the probability of a transmission error is very low.

But if the communicating devices are thousands of miles apart, the "wires" connecting them become very long, and the probability of transmission errors grows greatly. To guarantee that the data arriving at the far end is correct, reliability must be introduced.

In short, the root cause of unreliability in a network is that the "wires" used for long-distance transmission are too long, and all sorts of problems can occur along the way. TCP was born against exactly this background: it is a protocol designed to guarantee reliability.

Food for thought:

  • In fact, a single computer can be viewed as a small network: the hardware components inside it are constantly communicating with one another, and they too must follow their respective communication protocols, although those protocols mostly describe the meaning of the data being exchanged.

Why does the UDP protocol exist?

TCP is a reliable transport protocol: using TCP guarantees, to a large extent, that data is transmitted reliably. UDP, by contrast, is an unreliable transport protocol. What is the point of such an unreliable protocol?

"Unreliable" and "reliable" are neutral terms here; they simply describe characteristics of the two protocols.

  • Because TCP is reliable, it must do a great deal of extra work to guarantee the reliability of the data it transmits, and the more sources of unreliability there are, the higher the cost (in time and space) of guaranteeing reliability.
  • For example, packet loss, out-of-order delivery, and checksum failures can all occur during transmission; these are all forms of unreliability.
  • Because TCP has to solve these transmission-reliability problems, it is necessarily more complex than UDP, and its maintenance cost is especially high.
  • Because UDP is unreliable, it does not need to consider the problems that may occur in transit, so UDP is simple both to use and to maintain.
  • Note that although TCP is complex, it is not necessarily less efficient than UDP: TCP contains not only mechanisms that guarantee reliability but also various mechanisms that improve transmission efficiency.

Neither UDP nor TCP is "the best"; there is only the more suitable choice. Whether a network application uses TCP or UDP depends entirely on the upper-layer application scenario. If the scenario strictly requires that data be transmitted reliably, TCP must be used; if it can tolerate a small amount of packet loss, UDP should be preferred, because UDP is that much simpler.

Format of the TCP protocol

The format of the TCP header is as follows:

(figure: TCP header layout)

The meaning of each field in the TCP header:

  • Source/destination port numbers: indicate which process on the sending host the data comes from, and which process on the peer host it is destined for.
  • 32-bit sequence number / 32-bit acknowledgment number: number each byte of the data carried in a TCP segment and acknowledge the peer's data; these are the key fields with which TCP guarantees reliability.
  • 4-bit header length: the length of this TCP header, in units of 4 bytes.
  • 6 reserved bits: 6 bits of the TCP header that are currently unused.
  • 16-bit window size: a key field in TCP's reliability and efficiency mechanisms.
  • 16-bit checksum: filled in by the sender using the Internet checksum (a 16-bit one's-complement sum, not a CRC). If verification fails at the receiver, the received data is considered corrupted. (The checksum covers both the TCP header and the TCP payload.)
  • 16-bit urgent pointer: identifies the offset of urgent data within the segment; it must be used together with the URG bit in the flags field.
  • Options: the TCP header is allowed to carry additional option fields, at most 40 bytes.
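As a concrete illustration, here is a minimal sketch of the Internet checksum in Python. The function name is my own, and real TCP also sums a pseudo-header containing the IP addresses, which is omitted here:

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum over 16-bit words (RFC 1071 style)."""
    if len(data) % 2:                  # pad odd-length input with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # fold carries back in
    return ~total & 0xFFFF

# The receiver's check: summing the data together with its checksum
# must give 0xFFFF, i.e. recomputing the checksum over both yields 0.
csum = internet_checksum(b"helloTCP")
ok = internet_checksum(b"helloTCP" + csum.to_bytes(2, "big")) == 0
```

This self-verifying property is exactly what the receiver relies on: it sums header, payload, and checksum together and expects the all-ones result.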

The 6 flag bits in the TCP header:

  • URG: whether the urgent pointer is valid.
  • ACK: whether the acknowledgment number is valid.
  • PSH: prompts the receiving application to read the data out of the TCP receive buffer immediately.
  • RST: asks the peer to re-establish the connection. A segment carrying the RST flag is called a reset segment.
  • SYN: requests that a connection be established with the peer. A segment carrying the SYN flag is called a synchronization segment.
  • FIN: notifies the peer that this end is shutting down. A segment carrying the FIN flag is called a finish segment.

The TCP header is essentially a bit-field structure type in the kernel. Encapsulating data with a TCP header really amounts to defining a variable of this bit-field type, filling in each field of the TCP header, and finally copying the header in front of the data; that completes TCP header encapsulation.
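To visualize "filling in the fields and copying the header in front of the data", here is a hedged sketch in Python that packs a 20-byte option-less TCP header. The layout follows the field list above; pack_tcp_header is a name invented for illustration:

```python
import struct

def pack_tcp_header(src_port, dst_port, seq, ack, flags, window,
                    checksum=0, urgent=0):
    """Pack a 20-byte TCP header with no options (4-bit header length = 5)."""
    data_offset = 5                              # header length, in 4-byte units
    # One 16-bit field: 4-bit header length | 6 reserved bits | 6 flag bits
    offset_and_flags = (data_offset << 12) | (flags & 0x3F)
    return struct.pack("!HHIIHHHH", src_port, dst_port, seq, ack,
                       offset_and_flags, window, checksum, urgent)

syn = pack_tcp_header(1234, 80, seq=1, ack=0, flags=0x02, window=65535)  # 0x02 = SYN
```

In the kernel the same idea is expressed with a C struct whose bit-fields mirror the header diagram; the struct is filled in and then copied in front of the payload.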

How does TCP separate the header from the payload?

When TCP receives a segment from the layer below, it does not yet know the exact header length, but the first 20 bytes of the segment are always the basic TCP header, and those 20 bytes contain the 4-bit header length field.

TCP therefore separates the header from the payload as follows:

  • After receiving a segment, TCP first reads its first 20 bytes and extracts the 4-bit header length field, which yields the header size, size.
  • If size is greater than 20 bytes, TCP must read another size - 20 bytes from the segment; this data is the option fields of the TCP header.
  • After the basic header and the option fields have been read, everything that remains is the payload.

Note that the 4-bit header length is expressed in units of 4 bytes, which also happens to be the width of one row of the header diagram. Its value ranges from 0000 to 1111, so the maximum TCP header length is 15 × 4 = 60 bytes; since the basic header occupies 20 bytes, the option fields can be at most 40 bytes.

If a TCP header carries no options, its length is 20 bytes, and the 4-bit header length field then holds 20 ÷ 4 = 5, i.e. 0101.
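The separation steps above can be sketched as follows (a simplified model; split_tcp_segment is a hypothetical helper):

```python
def split_tcp_segment(segment: bytes):
    """Separate a TCP segment into (header, payload) via the 4-bit header length."""
    # The 4-bit header length lives in the high 4 bits of byte 12 of the header.
    header_len = (segment[12] >> 4) * 4      # units of 4 bytes -> 20..60 bytes
    return segment[:header_len], segment[header_len:]

# A header with no options: the length field holds 5, i.e. 5 * 4 = 20 bytes.
fake_header = bytes(12) + bytes([0x50]) + bytes(7)   # byte 12 = 0b0101_0000
header, payload = split_tcp_segment(fake_header + b"hello")
```

With a 60-byte header, byte 12 would carry 0xF0 in its high nibble, and the same two lines would peel off the options as part of the header.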

How does TCP decide which upper-layer process to deliver the payload to?

Every network process in the application layer must be bound to a port number.

  • A server process must explicitly bind a port number.
  • A client process has a port number bound to it dynamically by the operating system.

Since the destination port number is contained in the TCP header, TCP can extract it, locate the corresponding application-layer process, and hand the payload to that process for handling.

Note: the kernel maintains a hash mapping from port numbers to process IDs, so the transport layer can use a port number to quickly find the corresponding process ID and, from it, the corresponding application-layer process.
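The mapping can be modeled very roughly as a hash table. This is a toy sketch, not the kernel's actual data structures, and every name in it is hypothetical:

```python
port_table = {}      # port number -> process id, like the kernel's hash

def bind_port(port: int, pid: int):
    """A server binds its port explicitly; duplicate binds fail."""
    if port in port_table:
        raise OSError("port already in use")   # analogous to EADDRINUSE
    port_table[port] = pid

def deliver(dst_port: int, payload: bytes, inboxes: dict):
    """Hand the payload to the process bound to the destination port."""
    pid = port_table.get(dst_port)             # O(1) lookup by port number
    if pid is not None:
        inboxes.setdefault(pid, []).append(payload)

bind_port(80, 1001)          # a server explicitly binds its port
inboxes = {}
deliver(80, b"GET /", inboxes)
```

The hash lookup is why demultiplexing by destination port is cheap even with many bound sockets.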

Sequence number and acknowledgment number

What does "reliable" really mean?

In network communication, the party that sends data cannot by itself guarantee that the data is successfully received by the peer, because all sorts of errors can occur in transit. Only after receiving a response from the peer host can a sender be certain that the data it sent was reliably received. This is what true reliability means.

(figure: a response message confirms that the data was received)

Note: in the figure, a solid line indicates data that is guaranteed to be received by the other side; a dotted line carries no such guarantee.

But TCP must guarantee reliability for both directions of the conversation. Host A can now be sure that the data it sent was reliably received by host B, but host B also needs to be sure that the response it sent back to host A was reliably received. So after host A receives host B's response, it must in turn respond to that response, and then the reliability of host A's response must itself be guaranteed... an endless loop.

Since only the end that receives a response can be sure that its last transmission was reliably received, and a conversation always has a newest message still awaiting a response, 100% reliability cannot be guaranteed.

Strictly speaking, then, there is no 100% reliability in network communication, because there is always a latest message that has not yet been answered. But we do not need to guarantee the reliability of every message: it is enough that every piece of core data exchanged during communication has a matching response. For unimportant data (such as the responses themselves) we need not guarantee reliability: if the peer never receives a response, it concludes that the message it last sent was lost and simply retransmits it.

TCP calls this strategy the acknowledgment mechanism. Note that the acknowledgment mechanism does not guarantee the reliability of every message both sides exchange; rather, once one side receives the peer's response, it knows that the data it last sent was reliably received by the other side.

32-bit sequence number

If, during data communication, each side could send the next piece of data only after receiving the response to the previous one, the conversation would be entirely serial, and the efficiency can be imagined.

So in network communication one side is allowed to send several segments to the other in a row; as long as each segment sent has a matching response, it can still be confirmed that all of them were received by the peer.

But when several segments are sent back to back, each may take a different route through the network, so the order in which they reach the peer host may differ from the order in which they were sent. Ordering, however, is itself a form of reliability, and one purpose of the 32-bit sequence number in the TCP header is precisely to preserve the order of the segments.

TCP numbers every byte of data it sends; this number is called the sequence number.

  • For example, suppose the sender wants to transmit 3000 bytes of data. If it sends 1000 bytes at a time, three TCP segments are needed to send the 3000 bytes.
  • The 32-bit sequence number in each of these three segments is the sequence number of the first byte that segment carries, so the three values filled in are 1, 1001, and 2001.

When the receiving end gets the three segments, it can use the 32-bit sequence numbers in their headers to rearrange them in order (this happens at the transport layer) and then place them in the TCP receive buffer; at that point the order of the segments at the receiver matches the order in which they were sent.

  • While reordering, the receiver uses each segment's 32-bit sequence number together with the number of bytes in its payload to determine the sequence number the next segment should carry.
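The numbering in this example can be sketched as follows. segment is a hypothetical helper, and real TCP chooses a random initial sequence number; here it is fixed at 1 for clarity:

```python
import random

def segment(data: bytes, mss: int, isn: int = 1):
    """Split data into segments; each carries the seq number of its first byte."""
    return [(isn + off, data[off:off + mss])
            for off in range(0, len(data), mss)]

segs = segment(b"x" * 3000, mss=1000)          # seq numbers 1, 1001, 2001

# Segments may arrive in any order; the receiver sorts by sequence number.
random.shuffle(segs)
reassembled = b"".join(payload for _, payload in sorted(segs))
```

Sorting by sequence number is exactly the transport-layer rearrangement described above: after it, the receive buffer holds the bytes in their original order.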

32-bit acknowledgment number

The 32-bit acknowledgment number in the TCP header tells the peer which data has been received so far and from which byte its next transmission should start.

Continuing the example: when host B receives the segment from host A whose 32-bit sequence number is 1, which carries 1000 bytes of data, host B has received bytes 1 through 1000; in the header of its response to host A it therefore fills the 32-bit acknowledgment number with 1001.

  • On the one hand this tells host A: I have received every byte before sequence number 1001.
  • On the other hand it tells host A: the next time you send me data, start from the byte with sequence number 1001.

When host B later responds to other segments sent by host A, the 32-bit acknowledgment numbers in those responses are filled in the same way.

Note:

  • A response is a complete TCP segment like any other data. It may carry no payload, but it has at least a TCP header.
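The fill-in rule can be stated in one line of code (ack_for is a hypothetical name):

```python
def ack_for(seq: int, payload_len: int) -> int:
    """Acknowledgment number: sequence number of the next byte expected."""
    return seq + payload_len

# Host B received the segment with seq 1 carrying 1000 bytes:
assert ack_for(1, 1000) == 1001
# After the next 1000-byte segment (seq 1001) it would acknowledge 2001:
assert ack_for(1001, 1000) == 2001
```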

What if a segment is lost?

Continuing the example: host A sends three segments to host B; each carries a 1000-byte payload, and their 32-bit sequence numbers are 1, 1001, and 2001.

Suppose packet loss occurs in transit and, in the end, only the segments with sequence numbers 1 and 2001 reach host B. When host B reorders the segments, it finds that it has received only bytes 1-1000 and 2001-3000. When host B responds to host A, it therefore fills the 32-bit acknowledgment number of the response header with 1001, telling host A that its next transmission should start from the byte with sequence number 1001.

Note:

  • When host B responds to host A here, its 32-bit acknowledgment number must not be 3001, because bytes 1001-2000 come before 3001, and acknowledging 3001 would claim that every byte before sequence number 3001 had already been received.
  • So host B can only acknowledge 1001. When host A receives this acknowledgment number, it concludes that the segment with sequence number 1001 was lost and can choose to retransmit that data.

The sender can therefore use the acknowledgment numbers returned by the peer to determine whether a segment may have been lost in transit.
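A sketch of how the receiver derives its acknowledgment number when a gap exists; cumulative_ack is a hypothetical helper working on (seq, length) pairs:

```python
def cumulative_ack(received, isn: int = 1) -> int:
    """First in-order byte not yet received: the value to acknowledge."""
    next_expected = isn
    for seq, length in sorted(received):
        if seq == next_expected:      # contiguous: extend the in-order run
            next_expected = seq + length
        elif seq > next_expected:     # gap: everything beyond it is out of order
            break
    return next_expected

# Segments 1 and 2001 arrived, the one with seq 1001 was lost:
assert cumulative_ack({(1, 1000), (2001, 1000)}) == 1001   # not 3001
# Once the missing segment is retransmitted and received:
assert cumulative_ack({(1, 1000), (1001, 1000), (2001, 1000)}) == 3001
```

This is why the acknowledgment is called cumulative: it can only advance past bytes that have arrived with no holes before them.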

Why are two sequence-number fields needed?

If one side of the conversation only ever sent data and the other side only ever received it, a single sequence-number field would suffice.

  • When the sender sends data, the field would be treated as the 32-bit sequence number.
  • When the receiver responds to the sender's data, the field would be treated as the 32-bit acknowledgment number.

But TCP does not actually work that way, fundamentally because TCP is full-duplex: both sides may want to send data to each other at the same time.

  • Each segment a side sends must carry a 32-bit sequence number indicating the sequence number of the data it is currently sending.
  • It must also carry a 32-bit acknowledgment number, acknowledging the data the peer last sent and telling the peer which byte it should send next.

Since both parties in a TCP conversation rely on the acknowledgment mechanism, a single sequence-number field cannot satisfy both needs, and that is why two appear in the TCP header.

To sum up:

  • The 32-bit sequence number ensures that data arrives in order; it is also the basis for the 32-bit acknowledgment number the peer fills in when responding.
  • The 32-bit acknowledgment number tells the peer which bytes have been received so far and from which byte it should start sending next time.
  • The sequence number and the acknowledgment number are the concrete data representation of the acknowledgment mechanism; that mechanism is realized through these two fields.
  • In addition, the sequence and acknowledgment numbers make it possible to judge whether a segment has been lost.

Window size

TCP's receive buffer and send buffer

TCP maintains both a receive buffer and a send buffer:

  • The receive buffer temporarily stores data that has been received.
  • The send buffer temporarily stores data that has not yet been sent.
  • Both buffers are implemented inside TCP, at the transport layer.

  • The data in TCP's send buffer is written by the application layer above. When the upper layer calls a system-call interface such as write/send, the data is not pushed directly onto the network; it is copied from the application layer into TCP's send buffer.
  • The data in TCP's receive buffer is ultimately read by the application layer. When the upper layer calls a system-call interface such as read/recv, it is not reading directly from the network; it is copying data out of TCP's receive buffer into the application layer.
  • This is just like calling read and write on files: data is neither read directly from the disk nor written directly to it; the reads and writes go through the file buffer.

Once data has been written into TCP's send buffer, the corresponding write/send call can return. When the data in the send buffer is actually sent, and how, is decided by TCP.

This is exactly why TCP is called the Transmission Control Protocol: how the data is ultimately sent and received, and how the various problems encountered in transit are handled, are all up to TCP. Users only need to copy data into TCP's send buffer and read data out of TCP's receive buffer.

Note that the TCP layer is the same on both sides of the connection, so the TCP layer on each side has both a send buffer and a receive buffer.

Why the send and receive buffers exist

The functions of the send buffer and the receive buffer:

  • Errors can occur while data travels across the network, and the sender may be asked to retransmit. TCP must therefore provide a send buffer that keeps the transmitted data around so that it can be retransmitted; the portion of the send buffer holding a piece of data may be overwritten only after that data has been reliably received by the peer.
  • The receiving end processes data at a limited speed. To ensure that data that has not yet been processed is not forcibly discarded, TCP must provide a receive buffer that temporarily stores unprocessed data; transmitting data consumes resources, so correctly received segments must not be thrown away casually. TCP's reordering of data also takes place in the receive buffer.

The classic producer-consumer model:

  • For the send buffer, the upper-layer application keeps putting data in, while the network layers below keep taking data out for further encapsulation. The application plays the producer, the lower layers play the consumer, and the send buffer is the "trading place".
  • For the receive buffer, the upper-layer application keeps taking data out to process, while the network layers below keep putting data in. The application plays the consumer, the lower layers play the producer, and the receive buffer is the "trading place".
  • Introducing the send and receive buffers is therefore equivalent to introducing two producer-consumer models. The producer-consumer model decouples the upper-layer application from the lower-level communication details, and it also supports concurrency and evens out bursts of activity.
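The two producer-consumer relationships can be sketched with a bounded queue standing in for the send buffer (a toy model, not TCP's real buffers):

```python
import queue
import threading

send_buffer = queue.Queue(maxsize=8)        # bounded, like a fixed-size buffer

def app_writer():
    """Producer: the application 'writes' data into the send buffer."""
    for i in range(4):
        send_buffer.put(f"chunk-{i}")       # blocks if the buffer is full

def tcp_sender(sent):
    """Consumer: the lower layers drain the buffer for further encapsulation."""
    for _ in range(4):
        sent.append(send_buffer.get())      # blocks if the buffer is empty

sent = []
writer = threading.Thread(target=app_writer)
sender = threading.Thread(target=tcp_sender, args=(sent,))
writer.start(); sender.start()
writer.join(); sender.join()
```

The receive buffer is the mirror image: the lower layers put, the application gets.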

The 16-bit window size

When a sender transmits data to its peer, it is essentially moving data from its own send buffer into the peer's receive buffer. But buffers have finite size: if the receiver processes data more slowly than the sender transmits it, the receiver's buffer will sooner or later fill up, any further data the sender transmits will be dropped, and a chain reaction of packet loss and retransmission will follow.

That is why the TCP header carries a 16-bit window size. Each host fills this field with the amount of space remaining in its own receive buffer, i.e. its current capacity to receive data.

When the receiver responds to data from the sender, the 16-bit window size tells the sender how much space remains in the receiver's buffer, and the sender can adjust its transmission rate according to this field.

  • The larger the window size, the greater the receiver's capacity to accept data, and the faster the sender may transmit.
  • The smaller the window size, the lower the receiver's capacity, and the more the sender should slow down.
  • If the window size is 0, the receiver's buffer is full, and the sender should stop transmitting for the time being.
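The sender-side rule can be sketched in a few lines (sendable is a hypothetical helper; real TCP also factors in congestion control, omitted here):

```python
def sendable(unacked_bytes: int, peer_window: int) -> int:
    """How many more bytes may be in flight under the peer's advertised window."""
    return max(0, peer_window - unacked_bytes)

assert sendable(3000, peer_window=4096) == 1096   # room for 1096 more bytes
assert sendable(3000, peer_window=0) == 0         # window of 0: stop sending
```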

Understanding some familiar phenomena:

  • When writing TCP socket code, a call to read/recv on a socket may block because the socket has no data; in essence, TCP's receive buffer is empty, and we are really blocking on the receive buffer.
  • Likewise, a call to write/send may block because the socket is full; in essence, TCP's send buffer is full, and we are really blocking on the send buffer.
  • In a producer-consumer model, if the producer blocks while producing or the consumer blocks while consuming, it must be because some condition is not yet satisfied.

Six flag bits

Why do flag bits exist?

  • There are many kinds of TCP segments: besides the ordinary segments sent during normal communication, there are segments that request connection establishment and segments sent when the connection is torn down.
  • Different kinds of segments call for different actions on receipt. Ordinary data segments must be placed in the receive buffer to be read by the upper-layer application, whereas connection-setup and teardown segments are not handed to the user; instead the operating system's TCP layer performs the corresponding handshake or wave.
  • In other words, different segment types correspond to different processing logic, so there must be a way to tell segment types apart. TCP distinguishes them with the six flag bits in the header; each flag occupies one bit, where 0 means false and 1 means true.

SYN

  • A segment with SYN set to 1 is a request to establish a connection.
  • SYN is set only during the connection-establishment phase; it is not set during normal communication.

ACK

  • A segment with ACK set to 1 can acknowledge segments that have been received.
  • Apart from the very first connection request, virtually every segment has ACK set, because outgoing data can itself carry an acknowledgment of the data the peer last sent; when both sides are exchanging data, each can piggyback a response to the peer's previous transmission.

FIN

  • A segment with FIN set to 1 is a request to close the connection.
  • FIN is set only while the connection is being torn down; it is not set during normal communication.

URG

When two hosts communicate over the network, TCP guarantees that data arrives in order: even if the sender splits the data across several TCP segments, the data is in order by the time it reaches the receiver, because TCP can use the sequence numbers to rearrange the numbered segments, ensuring that the data is in order when it lands in the peer's receive buffer.

In-order arrival is indeed what we want, and the peer's upper layer likewise reads the data out of the receive buffer in order. But sometimes the sender has "urgent data" that the peer's upper layer should read first. What then?

This is where the URG flag comes in, together with the 16-bit urgent pointer in the TCP header:

  • When the URG flag is set to 1, the 16-bit urgent pointer in the TCP header is needed to locate the urgent data; otherwise the urgent pointer can generally be ignored.
  • The 16-bit urgent pointer is the offset of the urgent data within the segment.
  • Because there is only one urgent pointer and it can identify only a single position in the data, only one byte of urgent data can be sent; what that one byte means is not discussed here.

The fourth argument of the recv function, flags, accepts an option named MSG_OOB, where OOB is short for out-of-band. Out-of-band data is important data delivered outside the normal stream, so an upper layer that wants to read urgent data can call recv with the MSG_OOB option set.

Correspondingly, the fourth argument of the send function, flags, also accepts an MSG_OOB option; an upper layer that wants to send urgent data can call send with MSG_OOB set.
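A minimal sketch of both calls over a loopback connection (Linux-flavoured; the short sleep is a crude simplification to let the urgent byte arrive before it is read):

```python
import socket
import threading
import time

def server(srv, out):
    conn, _ = srv.accept()
    time.sleep(0.2)                              # let both bytes arrive
    out["oob"] = conn.recv(1, socket.MSG_OOB)    # the single urgent byte
    out["inband"] = conn.recv(1024)              # ordinary data, urgent byte excluded
    conn.close()

srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
out = {}
t = threading.Thread(target=server, args=(srv, out))
t.start()

cli = socket.socket()
cli.connect(srv.getsockname())
cli.send(b"AB")                    # normal in-band data
cli.send(b"!", socket.MSG_OOB)     # sets URG and the 16-bit urgent pointer
t.join()
cli.close()
srv.close()
```

By default the urgent byte is kept out of the normal stream, which is why the in-band recv sees only "AB".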

PSH

A segment with PSH set to 1 tells the peer: deliver the data in your receive buffer to the upper layer as soon as possible.

We usually assume the following:

  • When read/recv reads from a buffer, the call reads the data and returns if the buffer contains any; if the buffer is empty, read/recv blocks and returns only once data has arrived and been read.

This description is not quite accurate: in fact both the receive buffer and the send buffer have the notion of a watermark.

  • Suppose, for example, that the TCP receive buffer's watermark is 100 bytes. Then read/recv only reads those bytes and returns once the receive buffer holds at least 100 bytes of data.
  • If read/recv returned as soon as the buffer held any data at all, it would read and return frequently with tiny amounts, hurting the efficiency of reading (switching between kernel mode and user mode also has a cost).
  • So it is not the case that read/recv can read and return the moment the receive buffer contains data; reading happens once the amount of data in the buffer reaches a certain level.

When a segment arrives with PSH set to 1, it is effectively telling the peer's operating system: deliver the data in the receive buffer to the upper layer as soon as possible, even though it has not yet reached the watermark. This is also why, when we read with read/recv, the number of bytes we expect to read does not necessarily match the number of bytes we actually get.

RST

  • A segment with RST set to 1 tells the peer that the connection must be re-established.
  • If one side sends data when no connection is actually established between the two parties, the peer's response will have the RST flag set to 1, demanding that the connection be re-established.
  • Even after a connection has been established and normal communication is under way, if an anomaly is detected in the previously established connection, re-establishment will likewise be demanded.

Confirmation response mechanism (ACK)

TCP One of the mechanisms to ensure reliability is the acknowledgement response mechanism .

The acknowledgement response mechanism consists of TCP In the header ,32 Bit sequence number and 32 Bits to confirm the serial number . It needs to be emphasized again , The acknowledgement and response mechanism does not guarantee the reliability of all messages communicated by both parties , But by receiving a reply message from the other party , To ensure that a message you have sent to the other party has been reliably received by the other party .
 Insert picture description here

How to understand TCP Number each byte of data ?

TCP Is oriented to a byte stream , We can TCP Both the send buffer and the receive buffer of are imagined as an array of characters .
 Insert picture description here

  • Every byte the upper-layer application copies into the TCP send buffer then naturally carries a sequence number: the subscript of that byte in the character array, except that the numbering starts from 1 rather than 0.
  • When the two sides communicate, the essence is copying data from one side's send buffer into the other side's receive buffer.
  • The sequence number a sender fills into the header is the subscript, in the send buffer, of the first byte of the data being sent.
  • When the receiver acknowledges the data, the acknowledgment number it fills into the response header is the subscript of the position just past the last byte of valid data it has received into its receive buffer.
  • When the sender receives that acknowledgment, it can continue sending from the position whose subscript equals the acknowledgment number.
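The arithmetic in these bullets can be sketched in a few lines of Python (ack_for is a name of our own invention, not a real API):

```python
def ack_for(seq, payload_len):
    """The acknowledgment number is the sequence number of the next
    byte the receiver expects: the sequence number of the first byte
    of the segment plus the number of payload bytes received."""
    return seq + payload_len

# A sender transmits bytes 1-1000, 1001-2000 and 2001-3000 of its stream.
segments = [(1, 1000), (1001, 1000), (2001, 1000)]
for seq, length in segments:
    print(f"segment seq={seq} len={length} -> ACK {ack_for(seq, length)}")
# ACKs 1001, 2001, 3001 — each names the next byte the receiver expects.
```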

Timeout retransmission mechanism

When two parties communicate over the network, if the sender receives no acknowledgment for the data it sent within a specific time interval, it resends the data. This is TCP's timeout retransmission mechanism.

Note that TCP guarantees the reliability of communication partly through the TCP protocol header, and partly through the code logic that implements TCP.

Timeout retransmission, for example, means that after sending data the sender starts a timer; if no acknowledgment for that data arrives before the timer expires, the segment is retransmitted. This is implemented purely in TCP's code logic and is not reflected anywhere in the TCP header.

There are two cases of packet loss

Packet loss comes in two flavors. In the first, the data segment itself is lost: the sender receives no acknowledgment within the allotted time and performs a timeout retransmission.

In the second, the data segment arrives but the peer's acknowledgment is lost. The sender likewise receives no acknowledgment in time, and likewise performs a timeout retransmission.

  • When a packet is lost, the sender cannot tell whether its data segment was lost or the peer's acknowledgment was lost. In both cases it observes the same thing — no acknowledgment arrives — so all it can do is retransmit after the timeout.
  • If the sender retransmits because the peer's acknowledgment was lost, the receiver will receive the same data twice. That is not a problem: using the 32-bit sequence number in the header, the receiver can tell that it has already received this segment and deduplicate it.
  • Note that once data in the send buffer has been transmitted, the operating system does not immediately delete or overwrite it. The data is retained in the send buffer in case a timeout retransmission is needed, and only after its acknowledgment arrives can that region of the send buffer be deleted or overwritten.

Waiting time for timeout retransmission

The retransmission timeout can be set neither too long nor too short.

  • If the timeout is too long, then after a loss the peer goes a long time without receiving the data, which hurts overall retransmission efficiency.
  • If the timeout is too short, the peer may receive many duplicate segments: the acknowledgment may still be in transit, not lost at all, when the sender starts retransmitting, and flooding the network with duplicate segments wastes network resources.

The timeout must therefore be chosen sensibly. Ideally we want the smallest time that still guarantees "the acknowledgment will almost certainly return within this time". But that time depends on the network environment: when the network is good the timeout can be shorter, and when the network is congested it should be longer. In other words, the retransmission timeout must be able to adapt up and down, so it cannot be a fixed value.

To guarantee high-performance communication in any environment, TCP computes this maximum timeout dynamically.

  • On Linux (and likewise on BSD Unix and Windows), the timeout is managed in units of 500 ms, so the retransmission timeout is always an integer multiple of 500 ms.
  • If a retransmission still gets no acknowledgment, the next wait is 2 × 500 ms.
  • If there is still no acknowledgment, the next wait is 4 × 500 ms, and so on, growing exponentially.
  • After a certain number of accumulated retransmissions, TCP decides that the network or the peer host is in an abnormal state and forcibly closes the connection.
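The doubling schedule described above is easy to sketch; the function name and the retry cap below are our own choices for illustration:

```python
def retransmission_timeouts(base_ms=500, max_retries=6):
    """Successive retransmission waits under the scheme described
    above: the 500 ms base, doubled for every further retry."""
    return [base_ms * (2 ** i) for i in range(max_retries)]

print(retransmission_timeouts())   # [500, 1000, 2000, 4000, 8000, 16000]
```

After the last entry in such a schedule, TCP would give up and close the connection rather than keep doubling forever.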

Connection management mechanism

TCP is connection-oriented

TCP's various reliability mechanisms do not operate host-to-host; they are based on, and strongly tied to, individual connections. For example, a server may be accessed by many clients at once. If TCP were not connection-based, the server would have only one receive buffer, the data sent by every client would be copied into that one buffer, and the data could interfere with each other.

This is why a connection must be established before TCP communication begins: TCP's reliability guarantees are all built on top of the connection, so reliable data transfer presupposes a properly established connection.

Operating system management of connections

Being connection-oriented is part of TCP's reliability: only once a connection is properly established can the various reliability guarantees apply. A single machine may hold a large number of connections, so the operating system has to manage them.

  • The operating system manages connections by "describing first, then organizing". There must be a structure in the kernel that describes a connection, containing the connection's various attribute fields, and every instance of that structure is linked into some data structure. Managing connections then reduces to inserting, deleting, searching and updating entries in that data structure.
  • Establishing a connection really means defining a variable of this structure type in the kernel, filling in the connection's attribute fields, and inserting it into the data structure that manages connections.
  • Closing a connection means removing that entry from the data structure and releasing the resources the connection once occupied.
  • Connection management therefore has a cost: the time cost of maintaining these structures and the space cost of storing them.

Three handshakes

The three-way handshake process

A connection must be established before the two sides can communicate over TCP, and the process of establishing it is called the three-way handshake.

Take a client and a server as an example. When the client wants to talk to the server, it must first establish a connection, so the client, as the active party, sends a connection request to the server, and the TCP layers on both sides then carry out the three-way handshake automatically.

  • First handshake: the client sends the server a segment with the SYN bit set to 1, requesting to establish a connection with the server.
  • Second handshake: upon receiving the client's connection request, the server sends back a segment that both requests a connection in the other direction and acknowledges the client's request; in this segment both the SYN bit and the ACK bit are set to 1.
  • Third handshake: on receiving the server's segment, the client knows the server has received its request and is itself requesting a connection, so the client acknowledges the server's segment.

Note that the request the client initiates establishes the communication channel from client to server. Since TCP is full duplex, after the server receives the client's request it must also request a connection in the other direction, establishing the communication channel from server to client.

Why three handshakes?

First, we have to accept that connection establishment is not 100% guaranteed to succeed. In a three-way handshake, the first two handshakes are guaranteed to have been received, because each is answered by the handshake that follows it. The third handshake, however, has no corresponding response; if the ACK the client sends as the third handshake is lost, connection establishment fails.

After initiating the third handshake the client considers the handshake complete, but the server never received it and will not set up the corresponding connection. So no matter how many handshakes we use to establish the connection, the reliability of the last one can never be guaranteed.

Since establishment can never be made 100% reliable, the real basis for choosing the number of handshakes is which count brings the most benefit.

Three handshakes are the minimum number needed to verify both communication channels:

  • Because TCP is full duplex, the core task of connection establishment is really to verify that the channels in both directions work.
  • Three handshakes are exactly the minimum number of exchanges after which both sides know that they and their peer can each send and receive normally.
  • From the client's point of view: receiving the second handshake proves its own first handshake was reliably received, i.e. the client can send and the server can receive; and the fact that the second handshake arrived proves the server can send and the client can receive. The client has now verified both directions.
  • From the server's point of view: receiving the first handshake proves the client can send and the server can receive; receiving the third handshake proves its own second handshake was reliably received, i.e. the server can send and the client can receive. The server has now verified both directions.
  • Since three handshakes already verify both channels, more handshakes would of course also verify them, but once three suffice there is no need for more.

The three-way handshake ensures that, if establishment fails, the anomalous connection is left hanging on the client:

  • When the client receives the second handshake it has already verified both channels, so the moment it sends the third handshake, the connection is considered established on the client side.
  • The server, on the other hand, only learns that both channels work when it receives the third handshake, and only then does it set up the corresponding connection.
  • The two sides therefore establish their connections at different moments. If the client's third handshake is lost, no connection is created on the server side, while the client briefly maintains a connection that is in fact broken.
  • Maintaining a connection costs time and space, so another virtue of the three-way handshake is that when establishment goes wrong, the anomalous connection hangs on the client rather than burdening the server.
  • The client maintains this anomalous connection only briefly, and a single client rarely has many of them. A server is different: if many clients fail to complete establishment at once, the server would have to spend considerable resources maintaining all those anomalous connections.
  • Moreover, such a connection is not maintained forever. If the server goes a long time without receiving the third handshake, it retransmits the second handshake, giving the client a chance to resend the third. Alternatively, if the client believes the connection is up and sends data, the server will discover that it has no such connection and demand that the client re-establish it.

In summary, there are two reasons to use exactly three handshakes when establishing a connection:

  • Three handshakes are the minimum number needed to verify both communication channels, letting a viable connection be established as quickly as possible.
  • The three-way handshake ensures that an anomalous connection during establishment hangs on the client (transferring the risk).

State changes during the three-way handshake

The state changes during the three-way handshake are as follows:

  • Initially both the client and the server are in the CLOSED state.
  • To be able to receive connection requests from clients, the server must move from CLOSED to the LISTEN state.
  • The client can then initiate the three-way handshake; on sending the first handshake its state becomes SYN_SENT.
  • When the server in the LISTEN state receives the client's request, it places the connection in the kernel's pending queue and sends the second handshake, moving into the SYN_RCVD state.
  • When the client receives the second handshake it sends the final handshake; the client's connection is now established and its state becomes ESTABLISHED.
  • When the server receives the final handshake, the connection is established on its side too, and its state also becomes ESTABLISHED.

The three-way handshake is now complete, and the two sides can begin exchanging data.

The relationship between sockets and the three-way handshake

  • Before the client can initiate a connection request, the server must first enter the LISTEN state, which it does by calling the listen function.
  • Once the server is in LISTEN, the client can start the three-way handshake by calling the connect function.
  • Note that connect does not itself participate in the underlying handshake; it merely initiates it and waits. By the time connect returns, the underlying three-way handshake has either completed successfully or failed.
  • If the handshake succeeds, a connection is established on the server side, but it sits in the kernel's pending queue; the server must call the accept function to obtain the established connection.
  • Once the server has obtained the connection, both sides can exchange data with the read/recv and write/send functions.
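The listen/connect/accept sequence above maps directly onto Python's socket API. A minimal sketch on the loopback interface (port 0 asks the OS for any free port; the messages are just illustration):

```python
import socket
import threading

# Server side: listen() moves the socket into the LISTEN state; the
# kernel then completes three-way handshakes on its own, and accept()
# simply pops an already-established connection off the kernel queue.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(8)                       # backlog of pending connections
host, port = server.getsockname()

def serve_once():
    conn, addr = server.accept()       # take one established connection
    conn.sendall(b"hello from server")
    conn.close()

t = threading.Thread(target=serve_once)
t.start()

# Client side: connect() (wrapped by create_connection) triggers the
# handshake and returns once the connection is established, or raises
# if the handshake fails.
client = socket.create_connection((host, port))
print(client.recv(1024))               # b'hello from server'
client.close()
t.join()
server.close()
```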

Four waves

The process of the four waves

Since maintaining a connection costs both sides resources, the connection must be torn down when TCP communication ends, and the teardown process is called the four waves.

Take a client and a server again. When their communication is over, the client needs to disconnect from the server, which requires four waves.

  • First wave: the client sends the server a segment with the FIN bit set to 1, requesting to close the connection to the server.
  • Second wave: the server acknowledges the client's close request.
  • Third wave: having received the client's close request and having no more data to send, the server sends its own close request to the client.
  • Fourth wave: the client acknowledges the server's close request.

Only after the four waves is the connection between the two sides truly closed.

Why four waves?

  • Because TCP is full duplex, the connection is established in both directions and must be torn down in both directions as well: closing the channel from client to server and closing the channel from server to client each take two waves, hence four waves in total.
  • Note that the second and third waves cannot generally be merged. The third wave is the server's own close request, and after acknowledging the client's request the server may not issue it immediately: the server may still have data to deliver to the client, and only after that data has been sent will it send the third wave.

State changes during the four waves

The state changes during the four waves are as follows:

  • Before the waves begin, both the client and the server are in the ESTABLISHED state left by connection establishment.
  • To disconnect, the client actively sends a close request to the server, and the client's state becomes FIN_WAIT_1.
  • The server acknowledges the client's close request, and the server's state becomes CLOSE_WAIT.
  • When the server has no more data to send to the client, it sends its own close request and waits for the final ACK; its state becomes LAST_ACK.
  • On receiving the server's third wave, the client sends the final acknowledgment and enters the TIME_WAIT state.
  • When the server receives the final acknowledgment, it closes the connection completely and moves to the CLOSED state.
  • The client waits for a period of 2MSL (MSL: Maximum Segment Lifetime) before it too enters the CLOSED state.

The four waves are now complete, and the two sides have disconnected successfully.

The relationship between sockets and the four waves

  • The client initiating a close request corresponds to the client actively calling the close function.
  • The server initiating a close request corresponds to the server actively calling the close function.
  • One close corresponds to two waves; both sides must call close, hence four waves.

CLOSE_WAIT

  • During the four waves, if only the client calls close and the server does not, the server stays in the CLOSE_WAIT state and the client stays in FIN_WAIT_2.
  • Only after all four waves is the connection truly closed and its resources on both sides released. If the server never actively closes file descriptors it no longer needs, it will accumulate a large number of connections stuck in CLOSE_WAIT, each occupying server resources, and the server's available resources will steadily shrink.
  • Failing to close unused file descriptors promptly therefore causes not only a file-descriptor leak but also connection resources that are never fully released — effectively a memory leak.
  • So when writing network socket code, if you find the server holding many connections in the CLOSE_WAIT state, check whether the server is failing to call close on the corresponding file descriptors in time.
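At the application level, the symptom of the peer's FIN is that recv() returns an empty bytes object; closing promptly in response is what keeps the socket out of a lingering CLOSE_WAIT. A minimal sketch, with socket.socketpair standing in for a real TCP connection:

```python
import socket

# When the peer closes its end (the first wave, a FIN), our recv()
# returns b"". Responding by closing our own descriptor is what
# completes the teardown instead of parking the socket in CLOSE_WAIT.
a, b = socket.socketpair()
a.close()                 # peer closes its end
data = b.recv(1024)
if data == b"":
    print("peer closed; closing our end too")
    b.close()             # without this, a real TCP socket would linger
```

In a server loop, this is the branch that should call close (or release the connection object) as soon as recv signals end-of-stream.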

TIME_WAIT

How packet loss in each of the four waves is handled:

  • First wave lost: the client receives no acknowledgment from the server, so it retransmits after a timeout.
  • Second wave lost: the client likewise receives no acknowledgment, and retransmits after a timeout.
  • Third wave lost: the server receives no acknowledgment from the client, so it retransmits after a timeout.
  • Fourth wave lost: the server again receives no acknowledgment, and retransmits after a timeout.

If the client entered the CLOSED state immediately after the fourth wave, then when the server retransmits after its timeout, it would get no response at all, because the client has already closed the connection.

After several fruitless timeout retransmissions the server will eventually close the connection anyway, but during all those retransmissions it must keep maintaining a connection that has effectively been abandoned, which is very unfriendly to the server.

To avoid this, the client does not enter CLOSED immediately after the four waves. It enters the TIME_WAIT state instead, so that if the fourth wave is lost, the client can still receive the segment the server retransmits and acknowledge it.

Why the TIME_WAIT state is necessary:

  • By entering TIME_WAIT after the four waves, the client can, for a period of time, still receive a retransmitted FIN from the server and acknowledge it, which ensures with high probability that the final ACK is eventually received by the server.
  • When the client sends the final wave, historical data from the conversation may still be in flight in either direction. Entering TIME_WAIT also gives such data time to dissipate from the network as far as possible.

Of course, if the fourth wave is lost, the network between the two parties may itself be in trouble, and the client, though it has not yet closed the connection, may never receive the server's retransmitted close request at all. In that case the client's TIME_WAIT simply expires and the connection closes, while the server also closes after several timeout retransmissions. The server does end up maintaining a useless connection for a while in this situation, but such cases are the minority: the point of introducing TIME_WAIT is to shift the cost of maintaining the lingering connection onto the party that actively initiated the four waves.

So TCP cannot fully guarantee the reliability of connection establishment and teardown. What TCP guarantees is the reliability of the data the two sides exchange after the connection is established and before it is torn down.

How long does TIME_WAIT last?

The TIME_WAIT duration can be neither too long nor too short.

  • If it is too long, the waiting party stays in TIME_WAIT for a long time and must pay the cost of maintaining the connection throughout, which wastes resources.
  • If it is too short, the original goals may not be met: there is no longer a high probability that the final ACK reaches the peer, and no guarantee that in-flight data has dissipated from the network, in which case TIME_WAIT loses its purpose.

The TCP protocol specifies that the party that actively closes the connection shall remain in TIME_WAIT for two MSL (Maximum Segment Lifetime) before entering the CLOSED state.

RFC 1122 specifies an MSL of two minutes, but each operating system's implementation differs; on CentOS 7, for example, the default value is 60 s, which can be viewed with the command cat /proc/sys/net/ipv4/tcp_fin_timeout.
Why the TIME_WAIT duration is set to two MSL:

  • MSL is the maximum lifetime of a TCP segment in the network, so holding TIME_WAIT for 2MSL guarantees that any unreceived or late segments in both transmission directions have disappeared from the network.
  • It is also, in theory, the time needed to ensure that the final segment arrives reliably.
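A practical consequence of TIME_WAIT: a server restarted immediately after shutdown may find its old address still occupied by a connection lingering in TIME_WAIT, so a plain bind() to the same port fails. The conventional remedy is the SO_REUSEADDR socket option; a minimal sketch:

```python
import socket

# A server restarted right after shutdown often finds its previous
# connection still in TIME_WAIT, and bind() to the same port fails
# with "Address already in use". SO_REUSEADDR, set before bind(),
# asks the kernel to allow the rebind anyway.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind(("127.0.0.1", 0))
print("bound with SO_REUSEADDR on port", s.getsockname()[1])
s.close()
```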

Flow control

TCP can pace the sender's transmission rate according to the receiver's capacity to accept data; this mechanism is called flow control (Flow Control).

The receiver processes data at a finite rate. If the sender transmits too fast, the receiver's buffer fills up; if the sender keeps transmitting anyway, packets are dropped, which triggers retransmissions and a chain of further losses and retransmissions.

The receiver therefore tells the sender how much data it can currently accept, and the sender uses that information to throttle its transmission rate.

  • The receiver puts the amount of free space in its receive buffer into the "window size" field of the TCP header and reports it to the sender in its ACKs.
  • The larger the window size field, the higher the network's throughput.
  • Once the receiver finds its buffer nearly full, it advertises a smaller window to the sender.
  • On seeing the smaller window, the sender slows its transmission down.
  • If the receive buffer fills completely, the receiver advertises a window of 0. The sender then stops sending data, but must periodically send window-probe segments so that the receiver can report its window size again.

Once the sender learns that the receiver's capacity is 0, it stops sending data, and there are two ways it can learn when to resume.

  • Notification by the receiver: when the upper layer reads data out of the receive buffer, the receiver sends the sender a TCP segment actively announcing its new window size; on learning that there is room again, the sender resumes sending.
  • Active probing by the sender: at intervals, the sender transmits a segment carrying no payload, purely to ask the receiver for its window size; once the receive buffer has room, the sender resumes sending.
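The window a receiver can advertise is bounded by its kernel receive buffer, which can be inspected (and a new size requested) per socket via the SO_RCVBUF option. A minimal sketch; note that Linux doubles the requested value to account for bookkeeping overhead, so the readback typically differs from the request:

```python
import socket

# The advertised receive window is bounded by the kernel receive
# buffer; SO_RCVBUF lets us inspect it and request a different size.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
default_rcvbuf = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print("default receive buffer:", default_rcvbuf, "bytes")

s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 64 * 1024)   # request 64 KiB
print("after request:", s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF), "bytes")
# On Linux the readback here is typically 128 KiB (the request doubled).
s.close()
```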

The largest 16-bit number is 65535, so is TCP's maximum window 65535 bytes?

In theory yes, but in practice the 40-byte options area of the TCP header can carry a window scale factor M, and the actual window size is the value of the window field shifted left by M bits.
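The left shift is literal; a couple of lines make the arithmetic concrete (the function name is our own):

```python
def effective_window(window_field, scale_m):
    """The window scale option multiplies the 16-bit window field
    by 2**M — i.e. a left shift by M bits — lifting the 65535-byte cap."""
    return window_field << scale_m

print(effective_window(65535, 0))   # 65535   (no scaling)
print(effective_window(65535, 7))   # 8388480 (roughly an 8 MiB window)
```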

How does a sender know the peer's window size the very first time it sends data?

The three-way handshake that precedes TCP communication does more than verify that both channels work: the two sides also exchange other information, including each party's receive capacity. So before formal communication begins, each side already knows the other's window, and the very first data sent will not overflow the peer's buffer.

The sliding window

Sending multiple segments in succession

In TCP communication a party may send several segments to the peer in one go, overlapping the waits for their acknowledgments and thereby raising the efficiency of data transfer.

Note that although many segments may be sent at once, a sender cannot simply blast everything in its send buffer at the peer; it must still respect the peer's receive capacity when sending.

The sliding window

Since a sender may transmit several segments at once, a considerable amount of the data it has sent may be awaiting acknowledgment at any given moment.

The data in the send buffer can in fact be divided into three parts:

  • Data already sent and acknowledged (ACK received).
  • Data sent but not yet acknowledged.
  • Data not yet sent.

The second part of the send buffer is called the sliding window. (Some people instead call the three parts together the sliding window and refer to the second part as the window size.)

The sliding window describes the maximum amount of data the sender may have in flight at one time without waiting for an ACK.
The greatest significance of the sliding window is to improve sending efficiency:

  • The size of the sliding window equals the smaller of the peer's advertised window and the sender's own congestion window, because a sender must consider not only the peer's receive capacity but also the current state of the network.
  • Leaving the congestion window aside, suppose the peer's window is fixed at 4000 bytes. The sender may then have 4000 bytes in flight without waiting for any ACK, so the sliding window is 4000 bytes (four segments).
  • It can now send the segments 1001-2000, 2001-3000, 3001-4000 and 4001-5000 back to back, without waiting for any ACK in between.
  • When an acknowledgment arrives with acknowledgment number 2001, the segment 1001-2000 has been received by the peer and moves into the first part of the send buffer. Since we assumed the peer's window stays at 4000, the sliding window can slide right, the segment 5001-6000 can be sent, and so on.
  • The larger the sliding window, the higher the network's throughput — and the stronger the peer's receive capacity evidently is.

As segments are continuously acknowledged, the sender moves the acknowledged data to the left of the sliding window and, depending on the window's current size, decides whether data to the right of the window may enter it.

TCP's retransmission mechanism requires that data sent but not yet acknowledged be retained, and that data lives precisely inside the sliding window. Only data to the left of the window may be overwritten or deleted, because only that data is known to have been reliably received by the peer. So besides bounding how much unacknowledged data may be in flight, the sliding window is also what makes TCP's retransmission mechanism possible.

Does the sliding window always move right as a whole?

Not necessarily. Taking the earlier example, suppose the peer has received the segment 1001-2000 and acknowledged it, but its upper layer never reads data out of the receive buffer; when the peer receives 1001-2000, its advertised window therefore shrinks from 4000 to 3000.

When the sender receives the response with acknowledgment number 2001, it moves 1001-2000 to the left of the sliding window, but the peer's receiving capacity is now 3000. Once 1001-2000 has moved to the left of the window, the window is exactly 3000 bytes, so its right edge cannot extend further right.
 Insert picture description here
So while moving right, the sliding window does not necessarily move as a whole: the peer's receiving capacity keeps changing, and the window widens or narrows accordingly.

How the sliding window is implemented

TCP treats both the send and receive buffers as character arrays, and the sliding window can be regarded as a range delimited by two pointers: let start point to the left edge of the window and end point to the right edge, and everything in the interval between start and end is the sliding window.

When the sender receives a response from the peer with acknowledgment number x and window size win, it updates start to x and end to start + win.
 Insert picture description here
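As a concrete illustration of the two-pointer description above, here is a minimal Python sketch (an illustration only, not real kernel code; the names SlidingWindow, start, end, and win are our own):

```python
# Minimal sketch of the sliding window as two indices into the send
# buffer. start is the first unacknowledged sequence number; end is
# start plus the advertised window.

class SlidingWindow:
    def __init__(self, window_size):
        self.start = 1                   # left edge
        self.end = 1 + window_size       # right edge

    def on_ack(self, ack_no, win):
        """Update the window when an ACK arrives.

        ack_no: acknowledgment number from the peer (next byte expected)
        win:    window size advertised in the same segment
        """
        if ack_no > self.start:          # new data acknowledged
            self.start = ack_no          # left edge slides right
        self.end = self.start + win      # right edge follows the advertised window

w = SlidingWindow(4000)
w.on_ack(2001, 4000)
print(w.start, w.end)  # 2001 6001
```

Starting from a 4000-byte window, the ACK with acknowledgment number 2001 slides the left edge to 2001 and the right edge to 6001, matching the example in the text; if the peer instead advertised a window of 3000, the right edge would stay at 5001 and the window would narrow.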

Packet loss problem

When the sender transmits multiple segments at once, packet loss falls into two cases.

Case 1: the data arrived, but the ACK was lost.
 Insert picture description here
When the sender transmits several segments back to back, losing some of the ACKs does not matter; a later ACK can confirm them.

In the figure, the ACKs for the segments 2001-3000 and 4001-5000 are lost, but as long as the sender receives the response to the last segment 5001-6000, it knows the other two were in fact received: if the receiver had not received 2001-3000 and 4001-5000, it could not have set the acknowledgment number to 6001, since acknowledgment number 6001 means "all bytes 1-6000 have been received; next time, start sending from sequence number 6001".
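The cumulative-acknowledgment rule behind this can be sketched in a few lines of Python (an illustration with invented names, not protocol code): the receiver acknowledges only up to the end of the longest gap-free prefix of data it holds.

```python
# Illustrative sketch: a receiver generating cumulative ACKs. Even if
# the ACKs for 2001-3000 and 4001-5000 are lost on the way back, the
# ACK 6001 for the last segment tells the sender that everything up to
# byte 6000 arrived.

def cumulative_ack(received_segments):
    """received_segments: set of (first_byte, last_byte) tuples.
    Returns the acknowledgment number: the next byte expected in order."""
    next_expected = 1
    for first, last in sorted(received_segments):
        if first == next_expected:
            next_expected = last + 1
        else:
            break  # gap found; cannot acknowledge past it
    return next_expected

segments = {(1, 1000), (1001, 2000), (2001, 3000),
            (3001, 4000), (4001, 5000), (5001, 6000)}
print(cumulative_ack(segments))  # 6001
```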

Case 2: the data segment itself is lost.
 Insert picture description here

  • When the segment 1001-2000 is lost, the sender keeps receiving responses with acknowledgment number 1001, reminding it that "the next transmission should start from sequence number 1001".
  • If the sender receives three responses in a row with acknowledgment number 1001, it retransmits the segment 1001-2000.
  • When the receiver gets the retransmitted 1001-2000, it responds directly with acknowledgment number 6001, because it has already received the data 2001-6000.

This mechanism is called "high-speed retransmission control", better known as "fast retransmit".

Note that fast retransmit must strike a balance between retransmitting lots of data and retransmitting individual segments. In this example the sender does not know that only the segment 1001-2000 was lost; when it repeatedly receives acknowledgment number 1001, it could in theory retransmit everything from 1001 to 7000, but that might retransmit a large amount of data unnecessarily. So the sender first tries retransmitting only 1001-2000, then decides from the subsequent acknowledgment numbers whether anything else needs to be resent.
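A minimal sketch of that rule (the names are our own; real TCP stacks track much more state): count duplicate acknowledgments and, once three duplicates arrive, retransmit only the segment the peer keeps asking for.

```python
# Sketch of the fast-retransmit rule: after three duplicate ACKs for
# the same acknowledgment number, retransmit only the segment that
# starts at that number.

def fast_retransmit(acks, dup_threshold=3):
    """acks: stream of acknowledgment numbers received by the sender.
    Returns the sequence numbers that get retransmitted."""
    retransmitted = []
    last_ack, dup_count = None, 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == dup_threshold:
                retransmitted.append(ack)  # resend the segment starting at ack
        else:
            last_ack, dup_count = ack, 0
    return retransmitted

# The sender keeps receiving ACK 1001 because segment 1001-2000 was lost:
print(fast_retransmit([1001, 1001, 1001, 1001, 6001]))  # [1001]
```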

Is the data in the sliding window necessarily unreceived by the peer?

The data in the sliding window is data that has not yet been acknowledged, which does not mean the peer has not received it. Some of it may already have arrived, but because a segment near the left edge of the window was lost in transit, the data received after the gap cannot be acknowledged yet.

For example, if the segment 1001-2000 is lost in transit, then even though the peer has received all of 2001-5000, the acknowledgment number it sends can only be 1001. Once the sender retransmits 1001-2000, the peer's acknowledgment number jumps to 5001, and the data 1001-5000 in the send buffer immediately moves to the left of the sliding window.
 Insert picture description here

Fast retransmit vs. timeout retransmission

  • Fast retransmit resends data quickly: it triggers as soon as the sender receives the same acknowledgment three times in a row, unlike timeout retransmission, which sets a retransmission timer and only resends after a fixed interval.
  • Although fast retransmit detects loss quickly, it cannot fully replace timeout retransmission: sometimes a lost segment is never followed by three duplicate acknowledgments, in which case fast retransmit never triggers and only timeout retransmission can recover.
  • Fast retransmit is therefore an efficiency improvement, while timeout retransmission is the baseline of all retransmission mechanisms and is indispensable.

Congestion control

Why is congestion control needed?

During TCP communication between two hosts, losing the occasional segment is normal, and fast retransmit or timeout retransmission can recover it. But if a large fraction of segments are lost, that can no longer be considered normal.

TCP considers not only the two communicating hosts but also the network between them.

  • Flow control: considers the receiving capacity of the peer's receive buffer, throttling the sender's rate to avoid overflowing the peer's receive buffer.
  • Sliding window: considers the maximum amount of data the sender can transmit without waiting for an ACK, improving the sender's efficiency.
  • Congestion window: considers the state of the network between the two hosts; sending more than the congestion window allows may cause network congestion.

A small amount of loss during communication is tolerated by TCP, but once segments are lost in bulk, quantity turns into quality and the nature of the problem changes: TCP no longer assumes the trouble lies with the two endpoints' sending and receiving, and instead concludes that the network path between them is congested.

How is network congestion resolved?

When the network fails on a large scale, the two communicating hosts are just two small machines in it and seemingly can do nothing about it, but "no snowflake is innocent in an avalanche": network problems are the combined result of most hosts in the network.

  • If the hosts in the network all push large amounts of data into it at once, packets may pile up in long queues at the routers at key points in the network, until segments can no longer reach the peer within the timeout, which also shows up as packet loss.
  • When congestion occurs, the two endpoints cannot offer a particularly effective remedy, but they can at least avoid adding to the network's burden.
  • If many segments are lost during communication, they should not be retransmitted immediately; instead, the hosts should send less data, or even none, and only ramp the transmission rate back up slowly once the network recovers.

Note that congestion affects not just one host but almost every host in the network, so at that point every host using the TCP transmission control protocol runs the congestion avoidance algorithm.

So although congestion control looks like a communication strategy on a single host, it is in fact a policy that every host follows after the network breaks down. When congestion occurs, all hosts in the network are affected, and all of them must perform congestion avoidance; only then can congestion be relieved effectively, the "avalanche" avoided, or recovery reached as soon as possible.

Congestion control

Although the sliding window can send large amounts of data efficiently and reliably, sending a lot right at the start can cause problems. There are many computers on the network, and the network may already be congested; blindly sending large amounts of data without knowing the current network state can make things worse.

TCP therefore introduces a slow-start mechanism: at the beginning of a connection it sends a small amount of data to probe the path, learns the current congestion state, and then decides how fast to transmit.
 Insert picture description here

  • Besides the advertised window size and the sliding window, TCP has the concept of a congestion window. The congestion window is the threshold beyond which the network may become congested: sending more than the congestion window in one burst risks causing congestion.
  • At the start of transmission the congestion window is set to 1, and its value increases by one for every ACK received.
  • Each time a segment is sent, the sender compares the congestion window with the window advertised by the receiving host and uses the smaller of the two as the actual amount of data to send, i.e. the size of the sliding window.

Since the congestion window grows by one for every ACK received, it grows exponentially per round trip. Ignoring the peer's receiving capacity for the moment, so that the sliding window depends only on the congestion window, the congestion window changes as follows:

| Congestion window | Sliding window |
| --- | --- |
| 1 = 2^0 | 1 |
| 1 + 1 = 2^1 | 2 |
| 2 + 2 = 2^2 | 4 |
| 4 + 4 = 2^3 | 8 |

But exponential growth is very fast, so "slow start" is only slow at the beginning and accelerates later. If the congestion window kept growing exponentially, it could cause network congestion again in short order.

  • To avoid triggering congestion again shortly afterwards, the congestion window cannot be allowed to keep growing exponentially.
  • A slow-start threshold is therefore introduced: once the congestion window exceeds this threshold, it stops growing exponentially and grows linearly instead.
  • When TCP starts up, the slow-start threshold is set to the peer's maximum window size.
  • On every timeout retransmission, the slow-start threshold becomes half of the current congestion window, the congestion window is reset to 1, and the process repeats.

As shown in the figure:
 Insert picture description here
Explanation of the figure:

  • Exponential increase: at the start of a TCP connection the congestion window is 1, and it keeps growing exponentially.
  • Additive increase: the slow-start threshold starts out as the peer's maximum window size, 16 in the figure, so once the congestion window reaches 16 it stops growing exponentially and grows linearly.
  • Multiplicative decrease: the congestion window grows linearly until, at 24, congestion occurs; the slow-start threshold then becomes half of the current congestion window, i.e. 12, and the congestion window is reset to 1, so the next switch from exponential to linear growth happens at 12.

While a host communicates over the network, it continually cycles through exponential increase, additive increase, and multiplicative decrease.
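The three phases can be reproduced with a small round-based simulation (a simplification under the text's assumptions: the peer's window is ignored, growth happens per round trip, and congestion is forced to occur when the window reaches 24):

```python
# Simulation of the curve described above: slow start (exponential),
# congestion avoidance (linear), and the reaction to a timeout
# (ssthresh halves, cwnd resets to 1).

def next_cwnd(cwnd, ssthresh):
    """One round trip of congestion-window growth."""
    if cwnd < ssthresh:
        return min(cwnd * 2, ssthresh)  # slow start: exponential
    return cwnd + 1                     # congestion avoidance: linear

cwnd, ssthresh = 1, 16
history = []
for _ in range(20):
    history.append(cwnd)
    if cwnd == 24:              # pretend congestion happens here
        ssthresh = cwnd // 2    # multiplicative decrease: ssthresh = 12
        cwnd = 1                # restart slow start from 1
    else:
        cwnd = next_cwnd(cwnd, ssthresh)

print(history)  # [1, 2, 4, 8, 16, 17, ..., 24, 1, 2, 4, 8, 12, 13, ...]
```

The history grows 1, 2, 4, 8 up to the threshold 16, then linearly to 24; after the simulated congestion event it restarts at 1 and switches to linear growth at the new threshold 12, matching the figure.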

Note that not all hosts go through exponential increase, additive increase, and multiplicative decrease in lockstep. Each host's view of the congestion window can differ; even two hosts in the same region may hold different congestion window values at the same moment. So at any given time, some hosts may be communicating normally while others are experiencing network congestion.

Delayed acknowledgment

If the receiving host replies with an ACK immediately upon receiving data, the window it advertises may be smaller than necessary.

  • Suppose the peer's remaining receive-buffer space is 1M. After it receives 500K of data, an immediate ACK advertises a window of 500K.
  • But the receiving end actually processes data very quickly, consuming the 500K in the receive buffer within 10ms.
  • In that case the receiver is nowhere near its limit; it could keep up even with a considerably larger window.
  • If the receiver waits a little before acknowledging, say 200ms, the advertised window becomes 1M again.

Note that the purpose of delayed acknowledgment is not reliability; it is to give the upper application layer time to consume data from the receive buffer, so that the ACK, when sent, can advertise a larger window, increasing network throughput and thus transmission efficiency.
 Insert picture description here
Moreover, not every segment can be acknowledged with a delay.

  • Quantity limit: acknowledge once every N segments.
  • Time limit: acknowledge once the maximum delay is exceeded (this delay is chosen so as not to trigger spurious timeout retransmissions).

The exact count and delay depend on the operating system; typically N is 2 and the delay is 200ms.
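The two limits can be captured in one small predicate (a sketch; the constants 2 and 200ms are the typical values mentioned above, and the real values depend on the OS):

```python
# Sketch of the two delayed-ACK limits: acknowledge every N-th segment
# (N = 2 here), or when the delay exceeds a maximum (200 ms here).

def should_ack_now(unacked_count, ms_since_last_ack,
                   n=2, max_delay_ms=200):
    """Decide whether the receiver must send an ACK immediately."""
    return unacked_count >= n or ms_since_last_ack >= max_delay_ms

print(should_ack_now(1, 50))    # False: wait, more data may arrive
print(should_ack_now(2, 50))    # True: quantity limit reached
print(should_ack_now(1, 200))   # True: time limit reached
```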

Piggybacked acknowledgment

The piggybacked acknowledgment is actually TCP's most common mode of communication. Suppose host A sends a message to host B. B must reply with an ACK, but if B also has data to send to A at that moment, the ACK can hitch a ride on that data segment instead of being sent as a standalone ACK; the segment B sends then both carries data and acknowledges the data received. This is called a piggybacked acknowledgment.
 Insert picture description here
The most direct benefit of piggybacking is transmission efficiency: the two sides no longer need to exchange bare acknowledgment segments.

Moreover, since a segment carrying a piggybacked acknowledgment also carries payload data, the peer will acknowledge it in turn. When that acknowledgment arrives, the sender knows not only that its data was reliably received, but also that its own ACK reached the peer reliably.

Byte-stream oriented

When a TCP socket is created, the kernel creates a send buffer and a receive buffer for it at the same time.

  • Calling write copies data into the send buffer, after which write can return; from then on, TCP itself decides when the buffered data is transmitted.
  • If the data to send is too long, TCP splits it across multiple segments; if it is very short, TCP may hold it in the send buffer and wait for a better moment to send.
  • When data is received, it likewise flows from the network-card driver into the kernel's receive buffer, and the application calls read to fetch it.
  • A call to read may fetch any number of bytes from the receive buffer.

Because of these buffers, reads and writes in a TCP program need not match one to one. For example:

  • To write 100 bytes of data, you can call write once with 100 bytes, or call write 100 times with one byte each.
  • To read 100 bytes of data, you need not care how they were written: you can read 100 bytes at once, or read one byte at a time 100 times.

In fact, TCP does not care what the data in the send buffer means; to TCP it is just bytes, and its job is to deliver those bytes accurately to the peer's receive buffer. How the bytes are interpreted is entirely up to the upper-layer application. This is what "byte-stream oriented" means.
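This read/write mismatch can be demonstrated with a loopback connection (a self-contained sketch; the helper names are our own): the sender performs 100 one-byte writes, yet the receiver simply drains the stream in whatever chunks it likes and ends up with the same 100 bytes.

```python
# Loopback demonstration that TCP reads and writes need not match
# one-to-one: 100 one-byte sends on one side, a few large recv() calls
# on the other, same byte stream either way.

import socket
import threading

def sender(port):
    s = socket.create_connection(("127.0.0.1", port))
    for _ in range(100):
        s.sendall(b"x")        # 100 writes of 1 byte each
    s.close()                  # FIN: the receiver's recv() will return b""

srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
srv.listen(1)
port = srv.getsockname()[1]

t = threading.Thread(target=sender, args=(port,))
t.start()

conn, _ = srv.accept()
data = b""
while True:
    chunk = conn.recv(4096)    # the reader chooses its own chunk size
    if not chunk:
        break
    data += chunk
t.join()
conn.close()
srv.close()

print(len(data))  # 100: the same bytes, regardless of how they were written
```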

The sticky-packet problem

What is a sticky packet?

  • First, be clear that the "packet" in the sticky-packet problem refers to an application-layer packet.
  • The TCP header has no "packet length" field like UDP's.
  • From the transport layer's perspective, TCP delivers segments one by one, placing them in the buffer in sequence-number order.
  • But from the application layer's perspective, all that is visible is a continuous string of bytes.
  • So when the application looks at that string of bytes, it cannot tell where one complete application-layer packet ends and the next begins.

How to solve the sticky-packet problem

Solving the sticky-packet problem essentially means defining the boundaries between packets.

  • For fixed-length packets, make sure every read fetches exactly that fixed size.
  • For variable-length packets, the header can carry a field giving the packet's total length, so the reader knows where each packet ends. For example, the HTTP header contains a Content-Length attribute that states the length of the body.
  • For variable-length packets, an explicit delimiter between packets also works. Since the application-layer protocol is defined by the programmer, it is enough to ensure the delimiter never clashes with the payload.
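A minimal sketch of the length-field approach (the 4-byte big-endian prefix and the function names encode/decode_all are our own choices, not part of any standard):

```python
# Sketch of the "length field in the header" approach: each
# application-layer packet is prefixed with a 4-byte big-endian length,
# so the reader can cut the byte stream back into whole packets.

import struct

def encode(payload: bytes) -> bytes:
    return struct.pack("!I", len(payload)) + payload

def decode_all(stream: bytes):
    """Split a byte stream into complete packets; leftover bytes
    (an incomplete packet) are returned separately."""
    packets, offset = [], 0
    while offset + 4 <= len(stream):
        (length,) = struct.unpack_from("!I", stream, offset)
        if offset + 4 + length > len(stream):
            break                      # packet not fully received yet
        packets.append(stream[offset + 4: offset + 4 + length])
        offset += 4 + length
    return packets, stream[offset:]

# Two packets arrive "stuck together" in one read:
stream = encode(b"hello") + encode(b"world")
packets, rest = decode_all(stream)
print(packets, rest)  # [b'hello', b'world'] b''
```

decode_all also returns any trailing incomplete packet, which a real reader would keep and prepend to the next chunk read from the socket.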

Does UDP have a sticky-packet problem?

  • With UDP, if the upper layer has not yet collected the data, the length of each UDP datagram remains intact; UDP hands data to the application layer one datagram at a time, so there are clear data boundaries.
  • From the application layer's perspective, with UDP you either receive a complete datagram or nothing at all; you never receive "half" of one.

So UDP has no sticky-packet problem. The root cause is that the 16-bit UDP length field in the UDP header records the datagram's length, so UDP delimits messages at the lower layer, whereas TCP's sticky-packet problem exists because TCP is byte-stream oriented and has no explicit boundaries between messages.
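The boundary-preserving behavior of UDP can be seen directly with two loopback datagrams (a self-contained sketch; in-order, lossless loopback delivery is assumed, which holds in practice):

```python
# Demonstration that UDP preserves message boundaries: two sendto()
# calls produce two datagrams, and each recvfrom() returns exactly one
# whole datagram, never half of one.

import socket

recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv_sock.bind(("127.0.0.1", 0))       # port 0: let the OS pick a port
addr = recv_sock.getsockname()

send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_sock.sendto(b"hello", addr)
send_sock.sendto(b"world", addr)

first, _ = recv_sock.recvfrom(4096)    # one datagram per read
second, _ = recv_sock.recvfrom(4096)
print(first, second)  # b'hello' b'world'

send_sock.close()
recv_sock.close()
```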

TCP Abnormal situation

Process termination

While a client is communicating normally with a server, what happens to the established connection if the client process suddenly crashes?

When a process exits, all file descriptors it had open are closed automatically, so when the client process exits, it is as if close had been called on the corresponding file descriptor. The operating systems on both sides then complete the normal four-way handshake at the bottom layer and release the connection's resources. In other words, although the file descriptor is released when the process terminates, the TCP layer can still send the FIN, so this is no different from a normal process exit.

Machine restart

While a client is communicating normally with a server, what happens to the established connection if the client host is rebooted?

When we choose to restart the host, the operating system first kills all processes and then shuts down and restarts, so a reboot is equivalent to process termination: the operating systems on both sides complete the four-way handshake normally and then release the connection's resources.

The machine loses power / the network cable is unplugged

While a client is communicating normally with a server, what happens to the established connection if the client suddenly goes offline?

When the client drops offline, the server cannot know this in the short term, so the connection established with the client remains on the server side. But it does not remain forever, because TCP has a keep-alive strategy.

  • The server periodically probes the client to check whether it is still online; if no ACK is received after several attempts in a row, the server closes the connection.
  • In addition, the client may also periodically "check in" with the server; if the server has not heard from the client for a long time, it likewise closes the corresponding connection.

The server's periodic probing of the client's liveness is a heartbeat mechanism called the keep-alive timer, implemented by TCP. Some application-layer protocols have similar detection mechanisms as well; for example, HTTP over long-lived connections also periodically checks whether the peer is still alive.
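From an application, TCP's keep-alive timer is enabled per socket with SO_KEEPALIVE; the probe-timing options shown below (TCP_KEEPIDLE and friends) are Linux-specific, so this sketch guards them with hasattr:

```python
# Enabling the TCP keep-alive timer on a socket. SO_KEEPALIVE is
# portable; the TCP_KEEPIDLE/TCP_KEEPINTVL/TCP_KEEPCNT tuning options
# are Linux-specific, hence the hasattr guards.

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

if hasattr(socket, "TCP_KEEPIDLE"):
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)   # idle seconds before probing
if hasattr(socket, "TCP_KEEPINTVL"):
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)  # seconds between probes
if hasattr(socket, "TCP_KEEPCNT"):
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)     # failed probes before closing

keepalive_on = s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
print(keepalive_on)  # True
s.close()
```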

TCP summary

The TCP protocol is this complex because TCP must guarantee reliability while also improving performance as much as possible.

Reliability:

  • Checksum.
  • Sequence numbers.
  • Acknowledgment.
  • Timeout retransmission.
  • Connection management.
  • Flow control.
  • Congestion control.

Performance improvements:

  • Sliding window.
  • Fast retransmit.
  • Delayed acknowledgment.
  • Piggybacked acknowledgment.

Note that some of these TCP mechanisms are reflected in the TCP header, while others are embodied purely in code logic.

TCP timers

In addition, TCP sets up various timers.

  • Retransmission timer: handles lost or discarded segments; i.e. how long to wait for a segment's acknowledgment.
  • Persist timer: set up specifically for zero-window notifications from the peer; i.e. the interval at which window probes are sent to the peer.
  • Keep-alive timer: checks whether an idle connection still exists; i.e. the interval at which probe segments are sent to the peer.
  • TIME_WAIT timer: how long the side that actively closes the connection must wait after the four-way handshake.

Understanding the Transmission Control Protocol

TCP's various mechanisms say nothing about actually putting data on the wire; they are data-transmission policies. The TCP protocol makes the decisions in network data transmission and provides the theoretical backing, for example requiring that a segment be retransmitted when no ACK arrives within a certain time, while the actual transmission is carried out by the underlying IP layer and MAC frames.

TCP makes the decisions and IP + MAC carry them out; together they form the details of communication, and their shared goal is to deliver data to the peer host. What the data is transmitted for is decided by the application layer. So the application layer decides the meaning of the communication, while the transport layer and the layers below it decide the manner of the communication.

Application-layer protocols based on TCP

Common application-layer protocols based on TCP include:

  • HTTP (Hypertext Transfer Protocol).
  • HTTPS (HTTP over a secure transport).
  • SSH (Secure Shell protocol).
  • Telnet (remote terminal protocol).
  • FTP (File Transfer Protocol).
  • SMTP (Simple Mail Transfer Protocol).

This of course also includes the application-layer protocols you define yourself when writing your own TCP programs.

A word about cloud servers

SSH is the protocol underlying Xshell: when we use Xshell, we are really using Xshell's ssh client to connect to our cloud server.

When using Xshell, we can connect to a cloud server with ssh username@hostname(IP address). This works because a service such as sshd is running on the cloud server.
 Insert picture description here
This sshd is the server side of the ssh service, and the ssh in the ssh username@hostname(IP address) command is actually the ssh client, so connecting to the cloud server essentially means connecting an ssh client to an ssh server.

The corresponding ssh service can be seen with the netstat command. Insert picture description here
The various commands we type on the cloud server are ultimately sent to the server through a network socket; the server interprets them and then performs the corresponding actions.

Copyright notice
This article was created by [2021dragon]; please include a link to the original when reposting.
https://yzsam.com/2022/163/202206120934413165.html