Base64 encoding
2022-07-07 14:28:00 【Snail games】
Basic concepts
The term Base64 originates from the MIME content transfer encoding specification. Base64 is not an encryption algorithm, even though an encoded string looks somewhat like ciphertext. It is a binary-to-text encoding: it maps arbitrary binary data to an ASCII string so that the data can pass safely through text-only environments, for example an email client that supports MIME, or an XML document that needs to store complex data such as an image.
Base64 represents arbitrary binary data using 64 characters. It is an encoding, not encryption. Because it converts binary data into 64 "printable characters", the data can be carried inside the text-based parts of an HTTP message.
When do we use Base64? It is commonly used to carry binary data through text-oriented parts of HTTP, such as headers, URLs, form fields, and JSON or XML bodies. These contexts expect character data, and raw bytes placed there directly can be misinterpreted or corrupted, so the binary data is first converted to characters.
What are printable characters? In ASCII, the 33 codes 0-31 and 127 are control characters, and the 95 codes 32-126 are printable (see an ASCII table). A text-only channel can be relied on to pass only these 95 characters intact; bytes outside this range may be altered or dropped. So how can other bytes be transferred? One way is Base64.
Base64 uses 64 printable characters to represent binary data: the uppercase and lowercase letters, the digits, + and /. One extra character, =, is reserved for padding.
Note: because Base64 uses one 8-bit character to carry 6 bits of the message, the encoded string is about 33% larger than the original.
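The size overhead can be checked with a short sketch. The class and method names (`Base64Size`, `encodedLength`) are made up for illustration; only `java.util.Base64` comes from the JDK. It assumes padded output, where every group of up to 3 input bytes becomes 4 characters.

```java
import java.util.Base64;

class Base64Size {
    // Length of padded Base64 output for n input bytes:
    // 4 characters per 3-byte group, rounded up to a whole group.
    static int encodedLength(int n) {
        return 4 * ((n + 2) / 3);
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 2, 3, 300, 1000}) {
            // Encode n zero bytes and compare the real length with the formula.
            int actual = Base64.getEncoder().encodeToString(new byte[n]).length();
            System.out.printf("%d bytes -> %d chars (formula: %d)%n", n, actual, encodedLength(n));
        }
    }
}
```

For 300 bytes the formula gives 400 characters, which is where the "about 33%" figure comes from; very short inputs grow proportionally more because of padding.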
Base64 Encoding table
Code value | character | Code value | character | Code value | character | Code value | character | Code value | character | Code value | character | Code value | character | Code value | character |
0 | A | 8 | I | 16 | Q | 24 | Y | 32 | g | 40 | o | 48 | w | 56 | 4 |
1 | B | 9 | J | 17 | R | 25 | Z | 33 | h | 41 | p | 49 | x | 57 | 5 |
2 | C | 10 | K | 18 | S | 26 | a | 34 | i | 42 | q | 50 | y | 58 | 6 |
3 | D | 11 | L | 19 | T | 27 | b | 35 | j | 43 | r | 51 | z | 59 | 7 |
4 | E | 12 | M | 20 | U | 28 | c | 36 | k | 44 | s | 52 | 0 | 60 | 8 |
5 | F | 13 | N | 21 | V | 29 | d | 37 | l | 45 | t | 53 | 1 | 61 | 9 |
6 | G | 14 | O | 22 | W | 30 | e | 38 | m | 46 | u | 54 | 2 | 62 | + |
7 | H | 15 | P | 23 | X | 31 | f | 39 | n | 47 | v | 55 | 3 | 63 | / |
In other words, every index in the table fits in at most 6 bits. A byte, however, needs 8 bits. So how can 6 bits represent 8 bits of data? A single 6-bit group cannot hold an 8-bit byte, but 4×6 bits hold exactly 3×8 bits. Take the string "Son": its bytes are 01010011 01101111 01101110, and regrouped into 6-bit units they become 010100 110110 111101 101110, that is, the indices 20, 54, 61 and 46.
So Base64 encodes "Son" as "U29u". This is the convenient case, where 3 ASCII characters map to exactly 4 Base64 characters. But what happens when the number of characters to convert is not a multiple of 3? Base64 specifies that the trailing bits are filled with zeros up to the next 6-bit boundary. Take the single character "S" (01010011) as an example:
Splitting into 6-bit groups, the first group 010100 becomes the character "U". The second group holds only the remaining 2 bits, 11; with four 0 bits appended it becomes 110000, which maps to "w". The two missing output characters are replaced by "=". So "S" encodes to "Uw==". That is the Base64 encoding process.
What if the binary data is not a multiple of 3 bytes, so that 1 or 2 bytes are left over at the end? Base64 pads the tail with \x00 bytes before encoding and then puts 1 or 2 = signs at the end of the output to indicate how many bytes were padded. The decoder removes them automatically.
This is the case where the total number of bits is not a multiple of 6. With 1 byte left over, 4 zero bits are added and two = signs are appended; with 2 bytes left over, 2 zero bits are added and one = sign is appended. Either way the final group again stands for 24 bits (a multiple of 8), that is, 4 characters.
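The three cases can be reproduced with the JDK's own encoder. This is a minimal sketch; the class name `PaddingCases` and its `encode` helper are illustrative, not part of any library.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class PaddingCases {
    // Encodes the ASCII bytes of the text with the standard padded encoder.
    static String encode(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        System.out.println(encode("Son")); // 3 bytes: no padding needed
        System.out.println(encode("So"));  // 2 bytes left over: one '='
        System.out.println(encode("S"));   // 1 byte left over: two '='
    }
}
```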
To implement Base64, we first need to choose 64 suitable characters as the character set. The general rule is to pick 64 printable characters from a widely shared character set, which avoids data loss in transit (non-printable characters may be treated as special characters along the way and get lost). For example, MIME's Base64 uses the uppercase letters, the lowercase letters and 0~9 as the first 62 characters. Other implementations usually follow MIME closely and differ only in small details such as the last 2 characters or the padding, for example the modified Base64 used by UTF-7.
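As a sketch of how such character sets differ, the following builds the 64-character alphabet from the encoding table and compares the JDK's standard encoder with its URL-safe variant, which swaps only the last two characters for - and _. The class `Base64Alphabet` and its `alphabet` helper are hypothetical names used for illustration.

```java
import java.util.Base64;

class Base64Alphabet {
    // Builds a 64-character alphabet: A-Z, a-z, 0-9, then the two final symbols.
    static String alphabet(char c62, char c63) {
        StringBuilder sb = new StringBuilder(64);
        for (char c = 'A'; c <= 'Z'; c++) sb.append(c);
        for (char c = 'a'; c <= 'z'; c++) sb.append(c);
        for (char c = '0'; c <= '9'; c++) sb.append(c);
        return sb.append(c62).append(c63).toString();
    }

    public static void main(String[] args) {
        System.out.println(alphabet('+', '/')); // standard (MIME) alphabet
        System.out.println(alphabet('-', '_')); // URL-safe alphabet

        // These two bytes produce indices 62 and 63, so the variants visibly differ.
        byte[] data = {(byte) 0xFB, (byte) 0xFF};
        System.out.println(Base64.getEncoder().encodeToString(data));
        System.out.println(Base64.getUrlEncoder().encodeToString(data));
    }
}
```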
Base64 Complete example
Take the following text:
Man is distinguished, not only by his reason, but by this singular passion from
other animals, which is a lust of the mind, that by a perseverance of delight
in the continued and indefatigable generation of knowledge, exceeds the short
vehemence of any carnal pleasure.
After MIME Base64 encoding, it becomes a string that begins with "TWFu", the encoding of the first three characters "Man".
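The full encoded output can be reproduced with a short sketch using the JDK's MIME encoder, which wraps lines at 76 characters. The class name `QuoteExample` and its helpers are illustrative, and the text is assumed to be a single sentence with the line breaks replaced by spaces.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class QuoteExample {
    // The example text, joined into one line (an assumption about the original layout).
    static final String TEXT =
        "Man is distinguished, not only by his reason, but by this singular passion from "
        + "other animals, which is a lust of the mind, that by a perseverance of delight "
        + "in the continued and indefatigable generation of knowledge, exceeds the short "
        + "vehemence of any carnal pleasure.";

    // Encodes with MIME rules: standard alphabet, lines of at most 76 characters.
    static String encodeMime(String s) {
        return Base64.getMimeEncoder().encodeToString(s.getBytes(StandardCharsets.US_ASCII));
    }

    // Decodes MIME Base64, ignoring the line separators.
    static String decodeMime(String s) {
        return new String(Base64.getMimeDecoder().decode(s), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String encoded = encodeMime(TEXT);
        System.out.println(encoded);
        System.out.println(decodeMime(encoded).equals(TEXT)); // round trip check
    }
}
```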
Conversion method
Taking the start of the example, "Man" converted to "TWFu", let's walk through the basic Base64 conversion process:
1. The ASCII codes of M, a and n are 01001101, 01100001 and 01101110. Concatenated, they give the 24-bit binary string 010011010110000101101110.
2. Split it into 4 groups of 6 bits: 010011, 010110, 000101, 101110.
3. Finally, look up each group in the character set to get 4 characters (T, W, F, u) as the result (the MIME character set is the one listed in the encoding table above).
The basic idea of Base64 is that simple: it converts every 3 bytes (24 bits) into 4 characters. Because a 6-bit binary number can represent 64 different values, once the character set is fixed (it must contain 64 characters) and each character has a unique code, binary bytes can be converted to Base64 and back.
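The three steps above can be sketched by hand for a single 3-byte group. This is an illustrative implementation, not how any particular library does it; `GroupEncoder` and `encodeGroup` are made-up names.

```java
class GroupEncoder {
    // The MIME character set, in index order 0..63.
    static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Encodes exactly three bytes as four Base64 characters.
    static String encodeGroup(byte b0, byte b1, byte b2) {
        // Step 1: merge the three bytes into one 24-bit value.
        int bits = ((b0 & 0xFF) << 16) | ((b1 & 0xFF) << 8) | (b2 & 0xFF);
        char[] out = new char[4];
        for (int i = 0; i < 4; i++) {
            // Step 2: take the i-th 6-bit group, most significant first.
            int index = (bits >> (18 - 6 * i)) & 0x3F;
            // Step 3: look the index up in the character set.
            out[i] = ALPHABET.charAt(index);
        }
        return new String(out);
    }

    public static void main(String[] args) {
        System.out.println(encodeGroup((byte) 'M', (byte) 'a', (byte) 'n'));
    }
}
```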
Zero filling
After repeatedly converting every 3 bytes into 4 Base64 characters, one of the following 3 cases arises at the end:
1. No bytes are left
2. 1 byte is left
3. 2 bytes are left
Case 1 needs no further work. How are cases 2 and 3 handled?
In these cases, zeros are appended after the remaining bytes until the number of bits is divisible by 6 (because Base64 encodes 6 bits at a time). If 1 byte is left, that is 8 bits, then 4 zeros are appended to make 12 bits, which split into 2 groups. If 2 bytes are left, that is 16 bits, then only 2 zeros are needed (18 bits) to split into 3 groups. The groups are then mapped in the usual way.
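The zero-filling rule can be sketched as follows for the leftover bytes only. The names `TailEncoder` and `encodeTail` are hypothetical, and the sketch deliberately emits no "=" characters, since it covers the zero-filling step alone.

```java
class TailEncoder {
    static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Encodes 1 or 2 leftover bytes, zero-filling up to a multiple of 6 bits.
    static String encodeTail(byte[] tail) {
        if (tail.length != 1 && tail.length != 2) {
            throw new IllegalArgumentException("tail must hold 1 or 2 bytes");
        }
        int bitCount = tail.length * 8;          // 8 or 16 bits of real data
        int bits = 0;
        for (byte b : tail) {
            bits = (bits << 8) | (b & 0xFF);
        }
        int fill = (6 - bitCount % 6) % 6;       // 4 zero bits for 1 byte, 2 for 2 bytes
        bits <<= fill;                           // append the zero bits on the right
        int groups = (bitCount + fill) / 6;      // 2 or 3 groups
        StringBuilder sb = new StringBuilder();
        for (int i = groups - 1; i >= 0; i--) {
            sb.append(ALPHABET.charAt((bits >> (6 * i)) & 0x3F));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeTail(new byte[] {'M'}));      // 2 characters
        System.out.println(encodeTail(new byte[] {'M', 'a'})); // 3 characters
    }
}
```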
When decoding, every 4 characters are restored to 3 bytes, and one of the following 3 cases arises at the end:
1. No characters are left
2. 2 characters are left
3. 3 characters are left
These 3 cases correspond one to one with the 3 cases above. Reversing the zero-filling step restores the original data.
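Reversing the zero-filling step for the trailing characters might look like the sketch below. `TailDecoder` and `decodeTail` are illustrative names, and the input is assumed to contain no "=" characters.

```java
import java.nio.charset.StandardCharsets;

class TailDecoder {
    static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Decodes 2 or 3 trailing characters (without '=') back to 1 or 2 bytes.
    static byte[] decodeTail(String chars) {
        if (chars.length() != 2 && chars.length() != 3) {
            throw new IllegalArgumentException("tail must hold 2 or 3 characters");
        }
        int bits = 0;
        for (char c : chars.toCharArray()) {
            int index = ALPHABET.indexOf(c);
            if (index < 0) {
                throw new IllegalArgumentException("not a Base64 character: " + c);
            }
            bits = (bits << 6) | index;
        }
        int bitCount = chars.length() * 6;       // 12 or 18 bits
        int byteCount = bitCount / 8;            // 1 or 2 whole bytes
        bits >>= bitCount - byteCount * 8;       // drop the 4 or 2 fill bits
        byte[] out = new byte[byteCount];
        for (int i = 0; i < byteCount; i++) {
            out[i] = (byte) (bits >> (8 * (byteCount - 1 - i)));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(new String(decodeTail("TQ"), StandardCharsets.US_ASCII));
        System.out.println(new String(decodeTail("TWE"), StandardCharsets.US_ASCII));
    }
}
```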
Padding
We often see "=" characters in a Base64-encoded string. They are produced by padding. Padding means that when case 2 or 3 occurs during encoding, "=" characters are appended so that the number of encoded characters is a multiple of 4.
It follows that in case 2, where 1 byte is left, 2 "=" characters are needed: the last byte is encoded as 2 characters, and 2 "=" bring the total to 4. Case 3 is similar and needs 1 "=".
Padding is not strictly necessary, because the number of missing bytes can be computed from the encoded content even without it. Some implementations therefore require padding and others do not. One situation where padding must be used is when several Base64-encoded files are merged into one file.
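The sketch below compares padded and unpadded output using the JDK's `withoutPadding()` encoder, and shows that the decoded length can be recovered from the unpadded character count. The class `PaddingOptional` and its helpers are illustrative names.

```java
import java.util.Base64;

class PaddingOptional {
    // Standard encoding, with '=' padding.
    static String padded(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Same encoding, with the trailing '=' characters omitted.
    static String unpadded(byte[] data) {
        return Base64.getEncoder().withoutPadding().encodeToString(data);
    }

    // Number of original bytes implied by an unpadded length:
    // each character carries 6 bits, and only whole bytes count.
    static int decodedLength(int unpaddedChars) {
        return unpaddedChars * 6 / 8;
    }

    public static void main(String[] args) {
        byte[] one = {'M'};
        byte[] two = {'M', 'a'};
        System.out.println(padded(one) + " vs " + unpadded(one));
        System.out.println(padded(two) + " vs " + unpadded(two));
        System.out.println(decodedLength(unpadded(two).length()));
    }
}
```

Concatenation is where the missing padding hurts: two unpadded encodings "TQ" and "TQ" joined together give "TQTQ", which a decoder would read as a single 3-byte group rather than as two 1-byte payloads.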
Extended topic: using Base64 for encryption and decryption
Although the start of this article says that Base64 is not an encryption algorithm, we can in fact use Base64 to encrypt data.
As we all know, encryption is the process of turning plaintext into ciphertext. The algorithm plays the key role in this process, followed by the key. The algorithm is like a manufacturing process, and the key is the recipe. The manufacturing process can be made public, but the recipe must stay secret, otherwise anyone could produce Yunnan Baiyao.
It is easy to see that the recipe of Base64 is its character set. With a different character set, or even just a different order (numbering) of the characters in the set, the same procedure produces a different Base64 encoding.
For example, if I don't tell you which character set was used for encoding, can you recover the original text from an encoded string?
Since encrypting and decrypting with Base64 is entirely feasible, why do we say that it is not an encryption algorithm?
This is because:
1. Base64 was not developed for encryption, but to make it easier to transmit binary data in text environments.
2. Therefore, unlike an encryption algorithm, security is not a goal of Base64, only a by-product.
In fact, the security of Base64 is very poor, which is why it is not used for encryption in practice. If you know some common encryption methods, you will know an old one called "character substitution": define a rule that replaces each character with another, for example a becomes c, b becomes d and so on, and the result of the replacement is the ciphertext. Decryption simply reverses the operation, turning c into a and d into b. Different replacement rules produce different ciphertexts.
Encrypting with Base64 is equivalent to character substitution. It merely regroups the bytes first and then substitutes, so as an encryption process it is essentially the same thing.
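To make the equivalence with character substitution concrete, here is a sketch that encodes with the standard character set and then substitutes each character using a secretly reordered set. The scheme, the class `ShuffledBase64`, and the idea of deriving the order from a numeric key with `java.util.Random` are all assumptions made for illustration; as the text explains, this offers no real security.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class ShuffledBase64 {
    static final String STANDARD =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Derives a reordered character set from a numeric key (illustrative only).
    static String shuffledAlphabet(long key) {
        List<Character> chars = new ArrayList<>();
        for (char c : STANDARD.toCharArray()) chars.add(c);
        Collections.shuffle(chars, new Random(key)); // same key gives the same order
        StringBuilder sb = new StringBuilder(64);
        for (char c : chars) sb.append(c);
        return sb.toString();
    }

    // Replaces each character found in `from` by the character at the same
    // index in `to`; other characters such as '=' pass through unchanged.
    static String translate(String text, String from, String to) {
        StringBuilder sb = new StringBuilder(text.length());
        for (char c : text.toCharArray()) {
            int i = from.indexOf(c);
            sb.append(i >= 0 ? to.charAt(i) : c);
        }
        return sb.toString();
    }

    static String encode(String plain, long key) {
        String std = Base64.getEncoder().encodeToString(plain.getBytes(StandardCharsets.UTF_8));
        return translate(std, STANDARD, shuffledAlphabet(key));
    }

    static String decode(String cipher, long key) {
        String std = translate(cipher, shuffledAlphabet(key), STANDARD);
        return new String(Base64.getDecoder().decode(std), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String cipher = encode("Man", 42L);
        System.out.println(cipher);
        System.out.println(decode(cipher, 42L));
    }
}
```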
Base64 usage
Java 8 and later ship a Base64 implementation in java.util.Base64, so we can call it directly. The code is as follows (it assumes JUnit 4 is on the classpath):
package com.first;

import org.junit.Test;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Named Base64Test so the class does not clash with the imported org.junit.Test annotation
public class Base64Test {
    @Test
    public void test() {
        // Encode the UTF-8 bytes of "So" as a Base64 string
        String encode = Base64.getEncoder().encodeToString("So".getBytes(StandardCharsets.UTF_8));
        // Decode the Base64 string back into the original bytes
        byte[] decode = Base64.getDecoder().decode(encode);
        System.out.println(new String(decode, StandardCharsets.UTF_8));
    }
}
Base64 Characteristics
1. It is an encoding, not compression. Encoding only increases the size: the output is about 4/3 of the original byte count.
2. The method is simple and has little impact on performance.
3. The algorithm is reversible and decoding is easy, so it must not be used to protect private information.
4. Although decoding is easy, the data is still encoded, so the original content cannot be read directly by eye.
5. The encoded string contains only [0-9a-zA-Z+/=], so non-printable bytes (including escape characters) can also be transmitted.
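As a small sketch of the last point, the following encodes three bytes that are not printable ASCII (a NUL byte, 0xFF and a line feed) and checks that the output uses only the characters listed above. The class name `BinarySafe` and its helpers are illustrative.

```java
import java.util.Arrays;
import java.util.Base64;

class BinarySafe {
    static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // NUL, 0xFF and line feed: none of these is a printable ASCII character.
        byte[] raw = {0x00, (byte) 0xFF, 0x0A};
        String encoded = encode(raw);
        System.out.println(encoded);
        System.out.println(encoded.matches("[0-9a-zA-Z+/=]*")); // only Base64 characters
        System.out.println(Arrays.equals(raw, decode(encoded))); // bytes survive the round trip
    }
}
```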