Base64 encoding
2022-07-07 14:28:00 【Snail games】
Basic concepts
The term Base64 was first introduced in the MIME Content-Transfer-Encoding specification. Base64 is not an encryption algorithm, although an encoded string may look encrypted. It is a binary-to-text encoding: it maps arbitrary binary data to an ASCII string so that the data can pass safely through text-only environments, for example MIME-capable email applications, or XML documents that need to embed complex data such as images.
Base64 represents arbitrary binary data using 64 characters. It is an encoding method, not encryption. It converts binary data into text made up of 64 printable characters so that the data can be transmitted over HTTP.
When do we use Base64? It is typically used when binary data has to travel through a channel designed for text. Parts of HTTP, such as headers, URLs and text-based payloads like JSON or XML, are text-oriented, so binary data placed there must first be converted to character data. The raw bytes cannot simply be reinterpreted as characters, because many byte values do not correspond to characters that such a channel carries safely.
What are printable characters? In ASCII, codes 0-31 and 127 are the 33 control characters, and codes 32-126 are the 95 printable characters (see an ASCII table). A text-only channel can reliably carry only these 95 characters; anything outside this range may be lost or altered. So how can other byte values be transferred? One way is Base64.
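The 33/95 split can be checked with a few lines of Java. This is only an illustrative sketch; the class name `AsciiClasses` and its methods are made up for this example.

```java
// Illustrative sketch: classify the 128 ASCII codes as control or printable.
class AsciiClasses {
    /** True for codes 32..126, the printable ASCII range. */
    static boolean isPrintable(int code) {
        return code >= 32 && code <= 126;
    }

    /** Counts the printable codes among 0..127. */
    static int countPrintable() {
        int count = 0;
        for (int code = 0; code < 128; code++) {
            if (isPrintable(code)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int printable = countPrintable();
        System.out.println("printable: " + printable);
        System.out.println("control:   " + (128 - printable));
    }
}
```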
Base64 uses 64 printable characters to represent binary data: the uppercase and lowercase letters, the digits, + and /, plus the special character = used for padding.
Note: because Base64 uses one 8-bit character to represent every 6 bits of the message, the encoded string is about 33% larger than the original data.
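The size increase can be sketched in Java as follows. The helper `encodedLength` is a hypothetical name; it assumes padded output, where every started 3-byte group yields 4 characters, and compares the formula with what `java.util.Base64` produces.

```java
import java.util.Base64;

// Sketch: padded Base64 output uses 4 characters per started 3-byte group.
class Base64Size {
    /** Length of the padded Base64 output for n input bytes. */
    static int encodedLength(int n) {
        return 4 * ((n + 2) / 3);
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 2, 3, 300}) {
            int actual = Base64.getEncoder().encodeToString(new byte[n]).length();
            System.out.println(n + " bytes -> " + actual
                    + " characters (formula: " + encodedLength(n) + ")");
        }
    }
}
```

For 300 input bytes the result is 400 characters, which is the roughly 33% growth described above.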
Base64 Encoding table
| Code value | character | Code value | character | Code value | character | Code value | character | Code value | character | Code value | character | Code value | character | Code value | character |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | A | 8 | I | 16 | Q | 24 | Y | 32 | g | 40 | o | 48 | w | 56 | 4 |
| 1 | B | 9 | J | 17 | R | 25 | Z | 33 | h | 41 | p | 49 | x | 57 | 5 |
| 2 | C | 10 | K | 18 | S | 26 | a | 34 | i | 42 | q | 50 | y | 58 | 6 |
| 3 | D | 11 | L | 19 | T | 27 | b | 35 | j | 43 | r | 51 | z | 59 | 7 |
| 4 | E | 12 | M | 20 | U | 28 | c | 36 | k | 44 | s | 52 | 0 | 60 | 8 |
| 5 | F | 13 | N | 21 | V | 29 | d | 37 | l | 45 | t | 53 | 1 | 61 | 9 |
| 6 | G | 14 | O | 22 | W | 30 | e | 38 | m | 46 | u | 54 | 2 | 62 | + |
| 7 | H | 15 | P | 23 | X | 31 | f | 39 | n | 47 | v | 55 | 3 | 63 | / |
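The table follows a regular pattern, so it can be generated rather than typed out. The sketch below builds the 64-character alphabet in table order; the class name `Base64Alphabet` is only illustrative.

```java
// Sketch: build the Base64 alphabet in table order and look up entries in it.
class Base64Alphabet {
    /** Returns the 64 characters in order: A-Z, a-z, 0-9, '+', '/'. */
    static String build() {
        StringBuilder sb = new StringBuilder(64);
        for (char c = 'A'; c <= 'Z'; c++) sb.append(c);
        for (char c = 'a'; c <= 'z'; c++) sb.append(c);
        for (char c = '0'; c <= '9'; c++) sb.append(c);
        return sb.append('+').append('/').toString();
    }

    public static void main(String[] args) {
        String alphabet = build();
        System.out.println(alphabet.charAt(20));   // character for code value 20
        System.out.println(alphabet.indexOf('w')); // code value of the character 'w'
    }
}
```

Storing the alphabet as a string makes both directions simple: `charAt` maps a code value to its character and `indexOf` maps a character back to its code value.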
In other words, each code value fits in at most 6 bits when written in binary. An ASCII character, however, takes 8 bits. How can 6 bits represent 8 bits of data? A single 6-bit group cannot hold 8 bits, but 4×6 bits can hold exactly 3×8 bits. Take "Son": its three bytes 01010011 01101111 01101110 regroup into 010100 110110 111101 101110, that is, code values 20, 54, 61 and 46, which the table maps to U, 2, 9 and u.
So "Son" is encoded as "U29u". This is the convenient case, where 3 ASCII characters translate into exactly 4 Base64 characters. What happens when the number of characters to convert is not a multiple of 3? Base64 specifies that the missing bits are then filled with zeros. Take the single character "S" (01010011):

The bits are split into groups of 6. The first group, 010100, becomes "U". The second group holds only the remaining bits 11, so four zeros are appended to give 110000, which becomes "w". The two missing characters are replaced with "=". So "S" encodes to "Uw==". That is the Base64 encoding process.
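Both examples can be reproduced with the JDK's `java.util.Base64`. The helper names `bits` and `encode` below are made up for this sketch; `bits` prints the 8-bit pattern of each byte so the regrouping into 6-bit groups can be followed by hand.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch: show the bit pattern of a short ASCII string and its Base64 form.
class SonExample {
    /** Renders each byte as 8 binary digits, concatenated. */
    static String bits(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            String s = Integer.toBinaryString(b & 0xFF);
            sb.append(String.format("%8s", s).replace(' ', '0'));
        }
        return sb.toString();
    }

    /** Encodes an ASCII string with the standard Base64 encoder. */
    static String encode(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        for (String text : new String[] {"Son", "S"}) {
            System.out.println(text + " -> " + bits(text.getBytes(StandardCharsets.US_ASCII))
                    + " -> " + encode(text));
        }
    }
}
```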
What if the binary data to encode is not a multiple of 3 bytes long, so that 1 or 2 bytes are left over at the end? Base64 pads the end with \x00 bytes and then appends 1 or 2 = signs to the encoded output to indicate how many bytes were added; the decoder removes them automatically.
This is the case where the total number of bits is not a multiple of 6. When 4 zero bits are appended (1 byte left over), two = signs are added; when 2 zero bits are appended (2 bytes left over), one = sign is added. Either way the final group reaches 24 bits, a multiple of both 6 and 8.
To implement Base64, first choose 64 suitable characters to form the character set. The general rule is to pick 64 printable characters from a widely supported character set, which avoids data loss in transit (non-printable characters may be treated as special characters along the way and get lost). For example, MIME's Base64 uses the uppercase letters, the lowercase letters and 0-9 as the first 62 characters. Other implementations usually follow MIME and differ only in the last 2 characters, for example the variant used in UTF-7.
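As an illustration of a variant that differs only in the last two characters, the sketch below compares the standard encoder with the URL-safe encoder in `java.util.Base64`. Note that this is the URL-safe alphabet from RFC 4648 (`-` and `_` instead of `+` and `/`), not the UTF-7 variant mentioned above; it is used here only because the JDK provides it. The class and method names are illustrative.

```java
import java.util.Base64;

// Sketch: two alphabets that share the first 62 characters and differ in the last 2.
class AlphabetVariants {
    /** Encodes with the standard alphabet (ends in '+' and '/'). */
    static String standard(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    /** Encodes with the URL-safe alphabet (ends in '-' and '_'). */
    static String urlSafe(byte[] data) {
        return Base64.getUrlEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        // These two bytes are chosen so that code values 62 and 63 both appear.
        byte[] data = {(byte) 0xFB, (byte) 0xFF};
        System.out.println(standard(data)); // +/8=
        System.out.println(urlSafe(data));  // -_8=
    }
}
```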
Base64 Complete example
Take the following text:
Man is distinguished, not only by his reason, but by this singular passion from
other animals, which is a lust of the mind, that by a perseverance of delight
in the continued and indefatigable generation of knowledge, exceeds the short
vehemence of any carnal pleasure.
After MIME Base64 encoding, it becomes:
TWFuIGlzIGRpc3Rpbmd1aXNoZWQsIG5vdCBvbmx5IGJ5IGhpcyByZWFzb24sIGJ1dCBieSB0aGlz
IHNpbmd1bGFyIHBhc3Npb24gZnJvbSBvdGhlciBhbmltYWxzLCB3aGljaCBpcyBhIGx1c3Qgb2Yg
dGhlIG1pbmQsIHRoYXQgYnkgYSBwZXJzZXZlcmFuY2Ugb2YgZGVsaWdodCBpbiB0aGUgY29udGlu
dWVkIGFuZCBpbmRlZmF0aWdhYmxlIGdlbmVyYXRpb24gb2Yga25vd2xlZGdlLCBleGNlZWRzIHRo
ZSBzaG9ydCB2ZWhlbWVuY2Ugb2YgYW55IGNhcm5hbCBwbGVhc3VyZS4=
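The output above can be reproduced with the JDK's MIME encoder, which breaks lines after 76 characters using `\r\n`. This is a sketch under the assumption that the four lines of the quoted text are joined with single spaces; the class name `MimeExample` is illustrative.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch: MIME-style Base64 with 76-character lines, using the JDK encoder.
class MimeExample {
    static final String TEXT =
            "Man is distinguished, not only by his reason, but by this singular passion from "
            + "other animals, which is a lust of the mind, that by a perseverance of delight "
            + "in the continued and indefatigable generation of knowledge, exceeds the short "
            + "vehemence of any carnal pleasure.";

    /** Encodes text as MIME Base64 (lines of at most 76 characters, separated by CRLF). */
    static String encodeMime(String text) {
        return Base64.getMimeEncoder().encodeToString(text.getBytes(StandardCharsets.US_ASCII));
    }

    /** Decodes MIME Base64, ignoring the line separators. */
    static String decodeMime(String encoded) {
        return new String(Base64.getMimeDecoder().decode(encoded), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String encoded = encodeMime(TEXT);
        System.out.println(encoded);
        System.out.println(decodeMime(encoded).equals(TEXT)); // round trip
    }
}
```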
Conversion method
Taking the opening "Man", which is converted to "TWFu", as an example, here is the basic Base64 conversion process:
1. The ASCII codes of M, a and n are 01001101, 01100001 and 01101110. Concatenating them gives the 24-bit binary string 010011010110000101101110
2. Split it into 4 groups of 6 bits: 010011, 010110, 000101, 101110
3. Finally, look up the 4 corresponding characters in the character set (T, W, F and u) to get the result (the MIME character set is listed in the encoding table above).
The basic idea of Base64 is that simple: it converts every 3 bytes (24 bits) into 4 characters. Because a 6-bit binary number can represent 64 different values, once a character set of 64 characters is fixed and each character is assigned a unique code, binary bytes can be converted to Base64 and back.
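The three steps above can be sketched with plain bit operations. This is a minimal illustration for one complete 3-byte group, not a full encoder; `GroupEncoder` and `encodeGroup` are hypothetical names.

```java
// Sketch: encode exactly three bytes as four Base64 characters.
class GroupEncoder {
    static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    static String encodeGroup(byte b0, byte b1, byte b2) {
        // Step 1: merge the three bytes into one 24-bit value.
        int bits = ((b0 & 0xFF) << 16) | ((b1 & 0xFF) << 8) | (b2 & 0xFF);
        StringBuilder sb = new StringBuilder(4);
        // Step 2: read four 6-bit groups, most significant group first.
        for (int shift = 18; shift >= 0; shift -= 6) {
            int index = (bits >> shift) & 0x3F;
            // Step 3: look up the character for each group.
            sb.append(ALPHABET.charAt(index));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeGroup((byte) 'M', (byte) 'a', (byte) 'n')); // TWFu
    }
}
```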
Zero padding
After repeatedly converting every 3 bytes into 4 Base64 characters, one of the following 3 situations arises at the end:
1. No bytes are left
2. 1 byte is left
3. 2 bytes are left
Case 1 needs no further discussion. How are cases 2 and 3 handled?
In these cases, zero bits are appended after the remaining bytes until the number of bits is divisible by 6 (because Base64 encodes 6 bits at a time). If 1 byte (8 bits) is left, 4 zeros are appended to make 12 bits, which split into 2 groups; if 2 bytes (16 bits) are left, only 2 zeros are needed (18 bits), which split into 3 groups. The groups are then mapped in the usual way.
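The zero-fill step for the leftover bytes might look like this in Java. The sketch handles only the 1 or 2 trailing bytes and deliberately emits no "=" characters; `TailEncoder` and `encodeTail` are made-up names.

```java
// Sketch: encode the 1 or 2 bytes left over after the last full 3-byte group.
class TailEncoder {
    static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    static String encodeTail(byte[] tail) {
        if (tail.length != 1 && tail.length != 2) {
            throw new IllegalArgumentException("expected 1 or 2 leftover bytes");
        }
        int bits = 0;
        for (byte b : tail) {
            bits = (bits << 8) | (b & 0xFF);
        }
        int bitCount = tail.length * 8;      // 8 or 16 bits of data
        int zeros = (6 - bitCount % 6) % 6;  // 4 or 2 zero bits to append
        bits <<= zeros;
        bitCount += zeros;                   // now 12 or 18, divisible by 6
        StringBuilder sb = new StringBuilder();
        for (int shift = bitCount - 6; shift >= 0; shift -= 6) {
            sb.append(ALPHABET.charAt((bits >> shift) & 0x3F));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeTail(new byte[] {'S'}));      // 2 characters
        System.out.println(encodeTail(new byte[] {'S', 'o'})); // 3 characters
    }
}
```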

When decoding, every 4 characters are restored to 3 bytes, and one of the following 3 situations arises at the end:
1. No characters are left
2. 2 characters are left
3. 3 characters are left
These 3 situations correspond one-to-one to the 3 situations above. Reversing the zero-fill step restores the original data exactly.
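A decoder along these lines can be sketched with a small bit accumulator: each character contributes 6 bits, and a byte is emitted whenever 8 bits are available. The bits left over at the end are the zero fill and are simply dropped. `SimpleDecoder` is an illustrative name, and the sketch assumes well-formed input in the standard alphabet.

```java
import java.io.ByteArrayOutputStream;

// Sketch: decode Base64 by accumulating 6 bits per character and emitting whole bytes.
class SimpleDecoder {
    static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    static byte[] decode(String encoded) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int bits = 0;     // accumulator; only the lowest bitCount bits are meaningful
        int bitCount = 0; // number of bits not yet written out
        for (char c : encoded.toCharArray()) {
            if (c == '=') {
                break; // padding marks the end of the data
            }
            int index = ALPHABET.indexOf(c);
            if (index < 0) {
                throw new IllegalArgumentException("not a Base64 character: " + c);
            }
            bits = (bits << 6) | index;
            bitCount += 6;
            if (bitCount >= 8) {
                bitCount -= 8;
                out.write((bits >> bitCount) & 0xFF);
            }
        }
        // Any remaining bits (fewer than 8) are the zero fill and are discarded.
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(new String(decode("TWFu")));
        System.out.println(new String(decode("Uw==")));
        System.out.println(new String(decode("U28")));
    }
}
```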
Padding
Base64-encoded strings often end with "=" characters, which are produced by padding. Padding means that when case 2 or 3 occurs during encoding, "=" characters are appended so that the number of encoded characters is a multiple of 4.
It follows that in case 2, where 1 byte is left, 2 "=" characters are needed: the last byte is encoded as 2 characters, and 2 "=" bring the count to 4. Likewise, case 3 needs 1 "=".
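The number of "=" characters depends only on the input length modulo 3. The sketch below expresses that as a formula (the helper name `padding` is made up) and compares the JDK encoder with and without padding.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch: how many '=' characters are appended for a given input length.
class PaddingCount {
    /** 0 for a multiple of 3, 2 when 1 byte is left over, 1 when 2 bytes are left over. */
    static int padding(int n) {
        return (3 - n % 3) % 3;
    }

    static String padded(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.US_ASCII));
    }

    static String unpadded(String text) {
        return Base64.getEncoder().withoutPadding()
                .encodeToString(text.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        for (String text : new String[] {"S", "So", "Son"}) {
            System.out.println(text + ": " + padded(text) + " / " + unpadded(text)
                    + " (padding: " + padding(text.length()) + ")");
        }
    }
}
```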

Padding is not strictly necessary, because the number of missing bytes can be computed from the encoded content alone. Some implementations therefore require padding and others do not. One situation where padding is required is when several Base64-encoded files are concatenated into a single file.
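The concatenation problem can be demonstrated with a small sketch: two unpadded encodings of "S" joined together form a string that decodes as one 3-byte group, not as two separate "S" bytes. The class and method names are illustrative; the sketch relies on the JDK decoder accepting input without padding.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Sketch: without padding, concatenated encodings lose their boundaries.
class MergeWithoutPadding {
    static String encodeUnpadded(String text) {
        return Base64.getEncoder().withoutPadding()
                .encodeToString(text.getBytes(StandardCharsets.US_ASCII));
    }

    /** Concatenates the unpadded encodings of two strings and decodes the result as one unit. */
    static byte[] decodeMerged(String first, String second) {
        String merged = encodeUnpadded(first) + encodeUnpadded(second);
        return Base64.getDecoder().decode(merged);
    }

    public static void main(String[] args) {
        System.out.println(encodeUnpadded("S") + encodeUnpadded("S"));
        System.out.println(Arrays.toString(decodeMerged("S", "S")));                    // what comes out
        System.out.println(Arrays.toString("SS".getBytes(StandardCharsets.US_ASCII))); // what went in
    }
}
```

With padding, each piece ends on a 4-character boundary, so a reader can tell where one encoded file stops and the next begins.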
Extended topic: using Base64 for encryption and decryption
Although the beginning of this article said that Base64 is not an encryption algorithm, it can in fact be used to encrypt data.
As we all know, encryption is the process of turning plaintext into ciphertext. The algorithm plays the key role in this process, followed by the key. The algorithm is like the manufacturing process and the key is like the recipe. The manufacturing process can be public, but the recipe must be kept secret, otherwise anyone could produce Yunnan Baiyao.
It is easy to see that Base64's recipe is the character set. With a different character set, or even just a different ordering (numbering) of the same characters, the same procedure produces a different Base64 encoding.
For example, if I do not tell you which character set was used for encoding, can you work out the original text behind the following code?
TWl+Im1DImR5sHR5r2tFqXN4pWQ8ImZ/tih/r2BZImJZImx5sChCpWlDrGY8ImJFtihyuSh
Eqm1DInN5r2tFrmlCInhxsHN5rGYwp3J/rSh/tmx1syhxr219oWBDLihHqm1zqih5sChxImB
FsHQwrGowtmx1ImF5r2Q8InR4oXQwo30woShApXJDpXp1s2l+oGUwrGowpmV8qWt4t
ih5ryhEqmUwoGd+tm1+tWV0Iml+pih5r2R1p2lEqWtxo2B1Imt1r2VCoXR5rGYwrGowqGZ
/tGB1pmt1Lih1umN1pWRDInR4pShDqmdCtihGpWx1rWV+oGUwrGowoWZZImNxs2Zxrih
ArmVxsHVCpSY=
Since encrypting and decrypting with Base64 is entirely feasible, why do we say it is not an encryption algorithm?
This is because:
1. Base64 was not developed for encryption, but to make it easier to transmit binary data in text environments.
2. Therefore, unlike a purpose-built encryption algorithm, security is not a goal of Base64, merely a by-product.
In fact, the security of Base64 is very poor, which is why it is not used for encryption in practice. If you are familiar with common encryption methods, you will know an old one called "character substitution": define a rule that replaces each character with another, for example a becomes c and b becomes d, and the result of the replacement is the ciphertext. To decrypt, simply reverse the operation, turning c back into a and d back into b. Different substitution rules produce different ciphertexts.
Encrypting with Base64 is effectively character substitution. It merely rearranges the bytes first and then substitutes, which is essentially the same thing as far as the encryption process is concerned.
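The idea of a secret character set can be sketched as standard Base64 followed by a character-for-character substitution. Everything here is an assumption made for illustration: the alphabet is shuffled with a seeded `java.util.Random`, the seed stands in for the key, and the names are hypothetical. As the text explains, this offers very little security and should not be used to protect real data.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Sketch: Base64 with a shuffled alphabet, i.e. character substitution on top of Base64.
class ShuffledAlphabet {
    static final String STANDARD =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    /** Derives a secret alphabet by shuffling the standard one; the seed acts as the key. */
    static String shuffled(long seed) {
        List<Character> chars = new ArrayList<>();
        for (char c : STANDARD.toCharArray()) {
            chars.add(c);
        }
        Collections.shuffle(chars, new Random(seed));
        StringBuilder sb = new StringBuilder(64);
        for (char c : chars) {
            sb.append(c);
        }
        return sb.toString();
    }

    /** Replaces each character found in `from` with the character at the same position in `to`. */
    static String translate(String text, String from, String to) {
        StringBuilder sb = new StringBuilder(text.length());
        for (char c : text.toCharArray()) {
            int i = from.indexOf(c);
            sb.append(i >= 0 ? to.charAt(i) : c); // '=' is not in the alphabet and passes through
        }
        return sb.toString();
    }

    static String encode(String plain, String alphabet) {
        String standard = Base64.getEncoder()
                .encodeToString(plain.getBytes(StandardCharsets.UTF_8));
        return translate(standard, STANDARD, alphabet);
    }

    static String decode(String cipher, String alphabet) {
        String standard = translate(cipher, alphabet, STANDARD);
        return new String(Base64.getDecoder().decode(standard), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String alphabet = shuffled(42L);
        String cipher = encode("Man is distinguished", alphabet);
        System.out.println(cipher);
        System.out.println(decode(cipher, alphabet));
    }
}
```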
Base64 usage
Java ships a ready-made Base64 implementation that can be called directly. The code is as follows:
package com.first;

import org.junit.Test;

import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64Test {
    @Test
    public void test() {
        // Encode: the two bytes of "So" become "U28="
        String encode = Base64.getEncoder().encodeToString("So".getBytes(StandardCharsets.UTF_8));
        System.out.println(encode);
        // Decode back to the original string
        byte[] decode = Base64.getDecoder().decode(encode);
        System.out.println(new String(decode, StandardCharsets.UTF_8));
    }
}
Base64 characteristics
1. The algorithm is an encoding, not compression: encoding only increases the size, to about 4/3 of the original number of bytes.
2. The method is simple and has little impact on performance.
3. The algorithm is reversible and decoding is easy, so it is not suitable for transmitting private information.
4. Although decoding is easy, the data is still encoded, so the original content cannot be read directly with the naked eye.
5. The encoded string contains only [0-9a-zA-Z+/=], so data containing non-printable characters (including escape characters) can be transmitted too.
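The last point can be checked by encoding every possible byte value, control characters included, and confirming that the output stays within the listed character set. This is only a sketch; `OutputCharset` is an illustrative name.

```java
import java.util.Base64;

// Sketch: encode all 256 byte values and check which characters appear in the output.
class OutputCharset {
    static String encodeAllByteValues() {
        byte[] data = new byte[256];
        for (int i = 0; i < 256; i++) {
            data[i] = (byte) i; // includes control characters and non-ASCII values
        }
        return Base64.getEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        String encoded = encodeAllByteValues();
        System.out.println(encoded.matches("[0-9a-zA-Z+/=]+"));
        System.out.println(encoded.length() + " characters for 256 bytes");
    }
}
```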