Base64 encoding

2022-07-07 14:28:00 Snail games

Basic concepts

The term Base64 originally comes from the MIME content transfer encoding specification. Base64 is not an encryption algorithm, although an encoded string can look encrypted. It is a binary-to-text encoding: it maps arbitrary binary data to a string of ASCII characters so that the data can pass through text-only environments, for example an email client that supports MIME, or an XML document that needs to embed complex data such as an image.

Base64 is a way of representing arbitrary binary data using 64 characters. It is an encoding, not encryption. It converts binary data into text made up of 64 printable characters, which lets the data be carried over HTTP.

When do we use Base64? It is commonly used to carry binary data over HTTP. HTTP is a hypertext protocol, so binary data that travels in its text parts must first be converted into character data. A direct byte-to-character conversion does not work, because a text channel can only carry a limited set of characters reliably.

What are printable characters? In ASCII, the 33 characters with codes 0-31 and 127 are control characters, and the 95 characters with codes 32-126 are printable (see an ASCII table). In other words, a text channel can reliably carry only these 95 characters, and anything outside this range may not survive transmission. So how can other byte values be sent? One way is to use Base64.

Base64 uses 64 printable characters to represent binary data: the uppercase and lowercase letters, the digits, + and /, plus the special character = that is used for padding.

Note: because Base64 uses an 8-bit character to carry 6 bits of the message, the encoded string is about 33% larger than the original data.
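The 33% figure follows from 4 output characters for every 3 input bytes. Here is a minimal Java sketch to check it. The names Base64SizeDemo and encodedLength are made up for illustration; java.util.Base64 is the standard JDK class.

```java
import java.util.Base64;

final class Base64SizeDemo {
    // Length of padded Base64 output for n input bytes:
    // 4 characters for every (possibly partial) group of 3 bytes.
    static int encodedLength(int n) {
        return 4 * ((n + 2) / 3);
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 2, 3, 30, 300}) {
            byte[] data = new byte[n]; // contents do not matter for the length
            int actual = Base64.getEncoder().encodeToString(data).length();
            System.out.println(n + " bytes -> " + actual
                    + " characters (formula: " + encodedLength(n) + ")");
        }
    }
}
```

For small inputs the overhead is larger than 33% because of padding; it approaches 4/3 as the input grows.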

Base64 encoding table

Value Char   Value Char   Value Char   Value Char   Value Char   Value Char   Value Char   Value Char
    0 A          8 I         16 Q         24 Y         32 g         40 o         48 w         56 4
    1 B          9 J         17 R         25 Z         33 h         41 p         49 x         57 5
    2 C         10 K         18 S         26 a         34 i         42 q         50 y         58 6
    3 D         11 L         19 T         27 b         35 j         43 r         51 z         59 7
    4 E         12 M         20 U         28 c         36 k         44 s         52 0         60 8
    5 F         13 N         21 V         29 d         37 l         45 t         53 1         61 9
    6 G         14 O         22 W         30 e         38 m         46 u         54 2         62 +
    7 H         15 P         23 X         31 f         39 n         47 v         55 3         63 /

In other words, every index fits in at most 6 bits. An ASCII character, however, needs 8 bits. So how can 6 bits represent 8 bits of data? A single 6-bit group cannot hold 8 bits, but 4×6 bits can hold exactly 3×8 bits, as the following table shows:

Text            S          o          n
ASCII code      83         111        110
8-bit binary    01010011   01101111   01101110
6-bit groups    010100   110110   111101   101110
Index           20       54       61       46
Base64          U        2        9        u

You can see that Base64 encodes "Son" as "U29u". This is the ideal case: 3 ASCII characters turn into exactly 4 Base64 characters. But what happens when the number of characters to convert is not a multiple of 3? Base64 specifies that the input is then filled up with zero bits, as the table below shows for the single character "S":

Text            S
8-bit binary    01010011
6-bit groups    010100   110000   (none)   (none)
Index           20       48
Base64          U        w        =        =

The bits are taken 6 at a time. The first group maps to the character "U". The second group has only 2 data bits left, so 4 zeros are appended and it maps to the character "w". The remaining positions are filled with "=". So the character "S" is encoded as "Uw==". That is the Base64 encoding process.
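The regrouping into 6-bit groups can be reproduced with a few lines of Java. This is only an illustrative sketch: the class SixBitGroups and its groups method are hypothetical helpers, not part of any library, and the input is assumed to be plain ASCII.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class SixBitGroups {
    // Writes the input as a bit string, appends zeros until the length is a
    // multiple of 6, and returns the resulting 6-bit groups.
    static List<String> groups(String text) {
        StringBuilder bits = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.US_ASCII)) {
            String s = Integer.toBinaryString(b & 0xFF);
            bits.append(String.format("%8s", s).replace(' ', '0')); // left-pad to 8 bits
        }
        while (bits.length() % 6 != 0) {
            bits.append('0'); // zero fill for the last, incomplete group
        }
        List<String> out = new ArrayList<>();
        for (int i = 0; i < bits.length(); i += 6) {
            out.add(bits.substring(i, i + 6));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(groups("Son")); // four full groups
        System.out.println(groups("S"));   // two groups, the second zero-filled
    }
}
```

Each group, read as a number from 0 to 63, is the index that is looked up in the encoding table.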

What if the length of the binary data to encode is not a multiple of 3, so that 1 or 2 bytes are left over at the end? Base64 fills up the end with \x00 bytes, then appends 1 or 2 = signs to the encoded output to show how many bytes were added. They are removed automatically on decoding.

This is the case where the total number of bits is not a multiple of 6. When 4 zero bits have to be appended (1 byte left over), 2 = signs are added; when 2 zero bits have to be appended (2 bytes left over), 1 = sign is added. Either way the final group of 4 characters again stands for a multiple of 8 bits.
(Figure: padding mechanism)

 

To implement Base64, we first need to choose 64 suitable characters as the character set. A general rule is to pick 64 printable characters that are common to most character sets, so that data is not lost in transit (non-printable characters may be treated as special characters along the way and get lost). For example, MIME's Base64 implementation uses the uppercase letters, the lowercase letters and 0~9 as the first 62 characters. Other implementations usually follow MIME and differ only in the last 2 characters; UTF-7 is one example.
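As one concrete illustration of a variant that differs only in the last 2 characters, the JDK offers a URL-safe encoder that uses - and _ in place of + and /. This variant is not discussed in the article and is shown only as an example; the class and method names below are made up.

```java
import java.util.Base64;

final class AlphabetVariants {
    // Standard alphabet: indices 62 and 63 map to '+' and '/'.
    static String standard(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // URL-safe alphabet: indices 62 and 63 map to '-' and '_'.
    static String urlSafe(byte[] data) {
        return Base64.getUrlEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        // These two bytes produce the 6-bit indices 62, 63 and 60.
        byte[] data = {(byte) 0xFB, (byte) 0xFF};
        System.out.println(standard(data)); // +/8=
        System.out.println(urlSafe(data));  // -_8=
    }
}
```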

Complete Base64 example

The following text:

Man is distinguished, not only by his reason, but by this singular passion from
other animals, which is a lust of the mind, that by a perseverance of delight
in the continued and indefatigable generation of knowledge, exceeds the short
vehemence of any carnal pleasure.

becomes, after MIME Base64 conversion:

TWFuIGlzIGRpc3Rpbmd1aXNoZWQsIG5vdCBvbmx5IGJ5IGhpcyByZWFzb24sIGJ1dCBieSB0aGlz
IHNpbmd1bGFyIHBhc3Npb24gZnJvbSBvdGhlciBhbmltYWxzLCB3aGljaCBpcyBhIGx1c3Qgb2Yg
dGhlIG1pbmQsIHRoYXQgYnkgYSBwZXJzZXZlcmFuY2Ugb2YgZGVsaWdodCBpbiB0aGUgY29udGlu
dWVkIGFuZCBpbmRlZmF0aWdhYmxlIGdlbmVyYXRpb24gb2Yga25vd2xlZGdlLCBleGNlZWRzIHRo
ZSBzaG9ydCB2ZWhlbWVuY2Ugb2YgYW55IGNhcm5hbCBwbGVhc3VyZS4=
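The conversion above can be reproduced with the JDK's MIME encoder, which breaks the output into lines of at most 76 characters separated by CRLF. This is a sketch that assumes the four lines of the quotation are joined by single spaces; MimeExample and mimeLines are illustrative names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class MimeExample {
    static final String TEXT =
        "Man is distinguished, not only by his reason, but by this singular passion from "
        + "other animals, which is a lust of the mind, that by a perseverance of delight "
        + "in the continued and indefatigable generation of knowledge, exceeds the short "
        + "vehemence of any carnal pleasure.";

    // Encodes with the MIME encoder and splits the result into its output lines.
    static String[] mimeLines(String text) {
        String encoded = Base64.getMimeEncoder()
                .encodeToString(text.getBytes(StandardCharsets.US_ASCII));
        return encoded.split("\r\n");
    }

    public static void main(String[] args) {
        for (String line : mimeLines(TEXT)) {
            System.out.println(line);
        }
    }
}
```

The single = at the end of the last line shows that the text length leaves 2 bytes over after dividing by 3.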

Conversion method

Taking the start of the example, where "Man" becomes "TWFu", let's walk through the basic Base64 conversion:

1. The ASCII codes of M, a and n are 01001101, 01100001 and 01101110. Concatenating them gives the 24-bit binary string 010011010110000101101110.

2. Split it into 4 groups of 6 bits: 010011, 010110, 000101, 101110.

3. Finally, look up each group in the character set to get 4 characters (T, W, F, u) as the result (the character set defined by MIME is the encoding table listed earlier in this article).

The basic idea of Base64 is this simple: it converts every 3 bytes (24 bits) into 4 characters. Because a 6-bit binary number can represent 64 different values, once the character set (of 64 characters) is fixed and each character has a unique code, binary bytes can be converted to Base64 and back.
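The three steps can be sketched with bit shifts for a single 3-byte group. The class EncodeGroup and its method are hypothetical names used only for this illustration; the alphabet string follows the encoding table in this article.

```java
final class EncodeGroup {
    static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Converts exactly 3 bytes into 4 Base64 characters.
    static String encodeGroup(byte b0, byte b1, byte b2) {
        // Step 1: merge the three bytes into one 24-bit value.
        int bits = ((b0 & 0xFF) << 16) | ((b1 & 0xFF) << 8) | (b2 & 0xFF);
        char[] out = new char[4];
        for (int i = 0; i < 4; i++) {
            // Step 2: take the i-th 6-bit group, starting from the top.
            int index = (bits >> (18 - 6 * i)) & 0x3F;
            // Step 3: look the index up in the character set.
            out[i] = ALPHABET.charAt(index);
        }
        return new String(out);
    }

    public static void main(String[] args) {
        System.out.println(encodeGroup((byte) 'M', (byte) 'a', (byte) 'n')); // TWFu
    }
}
```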

Zero padding

After repeatedly converting every 3 bytes into 4 Base64 characters, one of the following 3 cases remains at the end:

1. No bytes left

2. 1 byte left

3. 2 bytes left

Case 1 needs no further discussion. How are cases 2 and 3 handled?

In these cases, zeros are appended after the remaining bytes until the number of bits is divisible by 6 (because Base64 encodes 6 bits at a time). If 1 byte is left, that is 8 bits, 4 zeros are appended to make 12 bits, which split into 2 groups. If 2 bytes are left, that is 16 bits, only 2 zeros are needed (18 bits) to split them into 3 groups. The groups are then mapped to characters in the usual way.

When decoding, every 4 characters are restored to 3 bytes, and one of the following 3 cases remains at the end:

1. No characters left

2. 2 characters left

3. 3 characters left

These 3 cases correspond one to one with the 3 cases above. Reversing the zero-padding step restores the original data exactly.
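The decoding direction can be sketched in the same spirit. The code below is a simplified illustration, not a production decoder: ManualDecoder is a made-up name, the standard alphabet is assumed, any trailing = characters are simply dropped, and the zero bits that were added during encoding are discarded at the end.

```java
import java.io.ByteArrayOutputStream;

final class ManualDecoder {
    static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    static byte[] decode(String text) {
        // Ignore trailing '=' characters, if present.
        int end = text.length();
        while (end > 0 && text.charAt(end - 1) == '=') {
            end--;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int buffer = 0;   // collects 6 bits per character
        int bitCount = 0; // number of not yet emitted bits in the buffer
        for (int i = 0; i < end; i++) {
            int value = ALPHABET.indexOf(text.charAt(i));
            if (value < 0) {
                throw new IllegalArgumentException("Not a Base64 character: " + text.charAt(i));
            }
            buffer = (buffer << 6) | value;
            bitCount += 6;
            if (bitCount >= 8) {
                bitCount -= 8;
                out.write((buffer >> bitCount) & 0xFF); // emit one complete byte
            }
        }
        // The 2 or 4 bits still in the buffer are the zero fill and are dropped.
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(new String(decode("U29u")));
        System.out.println(new String(decode("Uw==")));
    }
}
```

With 2 characters left over, 12 bits yield 1 byte; with 3 characters left over, 18 bits yield 2 bytes, which mirrors the encoding cases.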

Padding

In Base64-encoded strings we often see the "=" character, which comes from padding. Padding means that when case 2 or 3 occurs during encoding, "=" characters are appended so that the number of encoded characters is a multiple of 4.

It follows that in case 2, with 1 byte left over, 2 "=" characters are needed: the last byte is encoded as 2 characters, and 2 more "=" make exactly 4. Likewise, case 3 needs 1 "=".
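Zero filling and "=" padding can be combined into a small encoder. This is a minimal sketch for illustration only (ManualEncoder is a hypothetical name; in real code the JDK's java.util.Base64 would normally be used instead).

```java
final class ManualEncoder {
    static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    static String encode(byte[] data) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < data.length; i += 3) {
            int remaining = Math.min(3, data.length - i); // 1, 2 or 3 bytes in this group
            int bits = 0;
            for (int j = 0; j < 3; j++) {
                bits <<= 8;
                if (j < remaining) {
                    bits |= data[i + j] & 0xFF; // missing bytes stay zero (zero fill)
                }
            }
            for (int k = 0; k < 4; k++) {
                if (k <= remaining) {
                    // 1 byte gives 2 characters, 2 bytes give 3, 3 bytes give 4
                    out.append(ALPHABET.charAt((bits >> (18 - 6 * k)) & 0x3F));
                } else {
                    out.append('='); // pad the group to 4 characters
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("S".getBytes()));   // Uw==
        System.out.println(encode("So".getBytes()));  // U28=
        System.out.println(encode("Son".getBytes())); // U29u
    }
}
```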

Padding is not strictly required, because the number of missing bytes can be worked out from the encoded content itself. Some implementations therefore require padding and others do not. One situation where padding must be used is when several Base64-encoded files are concatenated into one file.
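In Java, for instance, the standard encoder can be asked to omit the padding, and the standard decoder accepts input whose final group is unpadded. The sketch below only illustrates this; the wrapper class and method names are made up.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class PaddingOptional {
    // Encodes without trailing '=' characters.
    static String encodeUnpadded(String text) {
        return Base64.getEncoder().withoutPadding()
                .encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    // Decodes padded or unpadded Base64 back to text.
    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String unpadded = encodeUnpadded("S");
        System.out.println(unpadded);         // Uw
        System.out.println(decode(unpadded)); // S
        System.out.println(decode("Uw=="));   // S
    }
}
```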

Extended topic: using Base64 for encryption and decryption

Although this article stated at the beginning that Base64 is not an encryption algorithm, we can in fact use Base64 to encrypt data.

As we all know, encryption is the process of turning plaintext into ciphertext. The algorithm plays the key role in this process, followed by the key. The algorithm is like a manufacturing process, and the key is like the recipe. The manufacturing process can be public, but the recipe must be kept secret, otherwise everyone could produce Yunnan Baiyao.

It is easy to see that the recipe of Base64 is its character set. With a different character set, or even just a different order (numbering) of the characters in the set, the same procedure produces a different Base64 encoding.

For example, if I do not tell you the character set used for encoding, can you work out the original text behind the following code?

TWl+Im1DImR5sHR5r2tFqXN4pWQ8ImZ/tih/r2BZImJZImx5sChCpWlDrGY8ImJFtihyuSh
Eqm1DInN5r2tFrmlCInhxsHN5rGYwp3J/rSh/tmx1syhxr219oWBDLihHqm1zqih5sChxImB
FsHQwrGowtmx1ImF5r2Q8InR4oXQwo30woShApXJDpXp1s2l+oGUwrGowpmV8qWt4t
ih5ryhEqmUwoGd+tm1+tWV0Iml+pih5r2R1p2lEqWtxo2B1Imt1r2VCoXR5rGYwrGowqGZ
/tGB1pmt1Lih1umN1pWRDInR4pShDqmdCtihGpWx1rWV+oGUwrGowoWZZImNxs2Zxrih
ArmVxsHVCpSY=

If encrypting and decrypting with Base64 is entirely feasible, why do we say that it is not an encryption algorithm?

This is because:

1. Base64 was developed not for encryption, but to make it easier to transfer binary data in text environments.

2. Therefore, unlike an encryption algorithm, security was never a goal of Base64; it is only a by-product.

In fact, the security of Base64 is very poor, which is why it is not used for encryption in practice. If you know some common encryption methods, you will know an old one called "character substitution": define a rule that replaces each character with another, for example a becomes c, b becomes d, and so on. The result of the replacement is the ciphertext. To decrypt, just apply the rule in reverse, turning c back into a and d back into b. Different substitution rules produce different ciphertexts.

Encrypting with Base64 is effectively character substitution. It merely regroups the bytes first and then substitutes, so as an encryption process it is essentially the same thing.
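A short sketch makes the equivalence concrete: encoding with a secret character set gives the same result as standard Base64 followed by a character-for-character substitution. The secret alphabet here (the standard one reversed) is purely hypothetical and is not the one used for the sample ciphertext above; all class and method names are illustrative.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class CustomAlphabetDemo {
    static final String STANDARD =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    // Hypothetical "recipe": the standard alphabet in reverse order.
    static final String SECRET = new StringBuilder(STANDARD).reverse().toString();

    // Replaces every character found in 'from' with the character at the
    // same position in 'to'; other characters such as '=' pass through.
    static String translate(String s, String from, String to) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            int i = from.indexOf(c);
            out.append(i >= 0 ? to.charAt(i) : c);
        }
        return out.toString();
    }

    static String encrypt(String plain) {
        String standard = Base64.getEncoder()
                .encodeToString(plain.getBytes(StandardCharsets.UTF_8));
        return translate(standard, STANDARD, SECRET);
    }

    static String decrypt(String cipher) {
        byte[] bytes = Base64.getDecoder().decode(translate(cipher, SECRET, STANDARD));
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String cipher = encrypt("Man");
        System.out.println(cipher + " -> " + decrypt(cipher));
    }
}
```

Anyone who recovers the substitution table, for example by frequency analysis, can undo the translate step and then decode normally, which is why this offers so little protection.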

Base64 usage
Java already ships a Base64 implementation, so you can simply call it. The code is as follows:

package com.first;

import org.junit.Test;

import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Renamed from Test: a class called Test clashes with the imported org.junit.Test
public class Base64Test {

    @Test
    public void test() {
        // Encode: "So" is 2 bytes, so the output ends with one '='
        String encode = Base64.getEncoder().encodeToString("So".getBytes(StandardCharsets.UTF_8));
        System.out.println(encode);
        // Decode back to the original bytes and rebuild the string
        byte[] decode = Base64.getDecoder().decode(encode);
        System.out.println(new String(decode, StandardCharsets.UTF_8));
    }
}

Base64 characteristics
1. It is an encoding, not compression: encoding only increases the number of bytes, to about 4/3 of the original;
2. The method is simple and has little impact on performance;
3. The algorithm is reversible and decoding is easy, so it is not suitable for communicating private information;
4. Although decoding is easy, the data is still encoded, so the original content cannot be read directly with the naked eye;
5. The encoded string contains only [0-9a-zA-Z+/=], so non-printable characters (including escape characters) can also be transmitted.
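Point 5 can be checked with a small sketch that encodes every possible byte value, including all control characters, and verifies the output against that character set. The class and method names are illustrative only.

```java
import java.util.Base64;

final class OutputCharsetCheck {
    // True if the encoded form consists only of Base64 characters,
    // optionally followed by one or two '=' padding characters.
    static boolean usesOnlyBase64Chars(byte[] data) {
        String encoded = Base64.getEncoder().encodeToString(data);
        return encoded.matches("[0-9a-zA-Z+/]*={0,2}");
    }

    public static void main(String[] args) {
        byte[] allBytes = new byte[256];
        for (int i = 0; i < allBytes.length; i++) {
            allBytes[i] = (byte) i; // includes control and non-ASCII byte values
        }
        System.out.println(usesOnlyBase64Chars(allBytes));
    }
}
```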
 

Copyright notice
This article was written by [Snail games]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/02/202202130614386889.html