Base64 encoding
2022-07-07 14:28:00 【Snail games】
Basic concepts
The term Base64 originated in the MIME content transfer encoding specification. Base64 is not an encryption algorithm, even though an encoded string may look encrypted. It is a binary-to-text encoding: it maps arbitrary binary data to an ASCII string so that the data can pass safely through a text-only environment, for example an email application that supports MIME, or an XML document that needs to store complex data such as an image.
Base64 represents arbitrary binary data using 64 characters. It is an encoding, not encryption: it converts binary data into a sequence drawn from 64 printable characters so that the data can, for example, be carried in the textual parts of an HTTP message.
When do we use Base64? A typical case is sending binary data through the text-oriented parts of HTTP. HTTP is a hypertext protocol, so binary data placed in a URL, a header, or a text form field must first be converted to characters. Raw bytes cannot simply be reinterpreted as text, because such channels reliably carry only printable characters.
What are printable characters? In ASCII, codes 0-31 and 127 are the 33 control characters, and codes 32-126 are the 95 printable characters (see an ASCII table). A text-only channel can be relied on to carry only these 95 characters; bytes outside this range may be altered or dropped. So how can other bytes be transmitted? One way is Base64.
Base64 uses 64 printable characters to represent binary data: the uppercase and lowercase letters, the digits, + and /, plus the special character = used for padding.
Note: because Base64 uses one 8-bit character to carry 6 bits of the message, the encoded string is about 33% larger than the original data.
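The size overhead can be checked with a little arithmetic. The following is only a sketch; the class name Base64Size and the method encodedLength are made up for this illustration. It assumes padded output, where every started group of 3 input bytes yields 4 characters.

```java
final class Base64Size {
    // Length of padded Base64 output for n input bytes:
    // 4 characters for every started group of 3 bytes.
    static long encodedLength(long n) {
        return 4 * ((n + 2) / 3);
    }

    public static void main(String[] args) {
        for (long n : new long[] {1, 2, 3, 300}) {
            System.out.println(n + " bytes -> " + encodedLength(n) + " characters");
        }
    }
}
```

For large inputs the ratio approaches 4/3, which is where the figure of roughly 33% comes from.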
Base64 Encoding table
Code value | character | Code value | character | Code value | character | Code value | character | Code value | character | Code value | character | Code value | character | Code value | character |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | A | 8 | I | 16 | Q | 24 | Y | 32 | g | 40 | o | 48 | w | 56 | 4 |
1 | B | 9 | J | 17 | R | 25 | Z | 33 | h | 41 | p | 49 | x | 57 | 5 |
2 | C | 10 | K | 18 | S | 26 | a | 34 | i | 42 | q | 50 | y | 58 | 6 |
3 | D | 11 | L | 19 | T | 27 | b | 35 | j | 43 | r | 51 | z | 59 | 7 |
4 | E | 12 | M | 20 | U | 28 | c | 36 | k | 44 | s | 52 | 0 | 60 | 8 |
5 | F | 13 | N | 21 | V | 29 | d | 37 | l | 45 | t | 53 | 1 | 61 | 9 |
6 | G | 14 | O | 22 | W | 30 | e | 38 | m | 46 | u | 54 | 2 | 62 | + |
7 | H | 15 | P | 23 | X | 31 | f | 39 | n | 47 | v | 55 | 3 | 63 | / |
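The table can be written down in code as a single 64-character string, where the position of a character is its code value. This is a minimal sketch; the names Base64Alphabet, toChar and toValue are illustrative and not part of any library.

```java
final class Base64Alphabet {
    // Index in this string = code value in the encoding table.
    static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Code value (0..63) to character.
    static char toChar(int value) {
        return ALPHABET.charAt(value);
    }

    // Character to code value, or -1 if the character is not in the table.
    static int toValue(char c) {
        return ALPHABET.indexOf(c);
    }

    public static void main(String[] args) {
        System.out.println(toChar(0) + " " + toChar(26) + " " + toChar(52) + " " + toChar(63));
        System.out.println(toValue('u'));
    }
}
```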
Each index in the encoding table is at most 63, so it fits in 6 bits. An ASCII character, however, needs 8 bits. How can 6-bit units represent 8-bit data? A single 6-bit group cannot hold 8 bits, but 4×6 bits hold exactly 3×8 bits. The three bytes of “Son” (01010011 01101111 01101110) regroup into four 6-bit values, as shown in the following table:
6-bit group | Code value | character |
---|---|---|
010100 | 20 | U |
110110 | 54 | 2 |
111101 | 61 | 9 |
101110 | 46 | u |
As you can see, Base64 converts “Son” to “U29u”. This is the convenient case: 3 ASCII characters translate to exactly 4 Base64 characters. But what happens when the number of bytes to convert is not a multiple of 3? Base64 specifies that the input is then filled out with zero bits until a whole group is formed. Take the single character “S” (01010011) as an example.
The bits are split into groups of 6. The first group, 010100, becomes “U”. The second group holds only the 2 remaining bits, 11, so 4 zeros are appended to give 110000, which becomes “w”. The two missing groups are written as “=”. So “S” is encoded as “Uw==”. That is the whole Base64 encoding process.
In other words, if the binary data is not a multiple of 3 bytes long, 1 or 2 bytes are left over at the end. Base64 pads the tail with \x00 bytes before encoding and then appends 1 or 2 = signs to the output to show how many bytes were padded; the decoder removes them automatically.
This happens whenever the total number of bits is not a multiple of 6. When 4 zero bits have been appended (1 byte left over), 2 = signs are needed to bring the encoded group up to a full 24 bits, a multiple of 8; when 2 zero bits have been appended (2 bytes left over), 1 = sign is needed.
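The three cases can be observed with the encoder that ships with Java. The snippet below is a small sketch using java.util.Base64; the class name PaddingDemo and the helper encode are illustrative only.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class PaddingDemo {
    // Encodes an ASCII string with the standard (padded) Base64 encoder.
    static String encode(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        // 3 bytes: no padding; 2 bytes: one '='; 1 byte: two '='.
        for (String s : new String[] {"Son", "So", "S"}) {
            System.out.println(s + " -> " + encode(s));
        }
    }
}
```

Running it shows “Son”, “So” and “S” encoded as “U29u”, “U28=” and “Uw==” respectively.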
To implement Base64, we first need to choose 64 suitable characters to form the character set. The general rule is to pick 64 printable characters that are common to most character sets, which avoids data loss in transit (non-printable characters may be treated as special characters along the way and get lost). For example, MIME's Base64 uses the uppercase letters, the lowercase letters, and 0~9 for the first 62 characters. Other implementations usually follow MIME and differ only in the last 2 characters; the variant used in UTF-7 is one example.
A complete Base64 example
Take the following text:
Man is distinguished, not only by his reason, but by this singular passion from
other animals, which is a lust of the mind, that by a perseverance of delight
in the continued and indefatigable generation of knowledge, exceeds the short
vehemence of any carnal pleasure.
After MIME Base64 encoding, it becomes:
TWFuIGlzIGRpc3Rpbmd1aXNoZWQsIG5vdCBvbmx5IGJ5IGhpcyByZWFzb24sIGJ1dCBieSB0aGlz
IHNpbmd1bGFyIHBhc3Npb24gZnJvbSBvdGhlciBhbmltYWxzLCB3aGljaCBpcyBhIGx1c3Qgb2Yg
dGhlIG1pbmQsIHRoYXQgYnkgYSBwZXJzZXZlcmFuY2Ugb2YgZGVsaWdodCBpbiB0aGUgY29udGlu
dWVkIGFuZCBpbmRlZmF0aWdhYmxlIGdlbmVyYXRpb24gb2Yga25vd2xlZGdlLCBleGNlZWRzIHRo
ZSBzaG9ydCB2ZWhlbWVuY2Ugb2YgYW55IGNhcm5hbCBwbGVhc3VyZS4=
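A result in this shape can be reproduced with Java's MIME flavour of the encoder, which breaks the output into lines of at most 76 characters. This is a sketch rather than a reference implementation; the class name MimeDemo and its two helpers are made up for the illustration, and the input is assumed to be one line of ASCII text with single spaces where the display above wraps.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class MimeDemo {
    // MIME flavour: output lines of at most 76 characters, separated by CRLF.
    static String encode(String text) {
        return Base64.getMimeEncoder().encodeToString(text.getBytes(StandardCharsets.US_ASCII));
    }

    // The MIME decoder skips line separators before decoding.
    static String decode(String encoded) {
        return new String(Base64.getMimeDecoder().decode(encoded), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String text = "Man is distinguished, not only by his reason, but by this singular passion from "
            + "other animals, which is a lust of the mind, that by a perseverance of delight "
            + "in the continued and indefatigable generation of knowledge, exceeds the short "
            + "vehemence of any carnal pleasure.";
        String encoded = encode(text);
        System.out.println(encoded);
        System.out.println(decode(encoded).equals(text));
    }
}
```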
Conversion method
Taking the conversion of “Man” into “TWFu” at the start of the example above, let's look at the basic Base64 conversion process:
1. The ASCII codes of M, a and n are 01001101, 01100001 and 01101110; joining them gives the 24-bit binary string 010011010110000101101110
2. Split it into 4 groups of 6 bits: 010011, 010110, 000101, 101110
3. Finally, look up the 4 corresponding characters in the character set (namely T, W, F, u) to get the result (the character set defined by MIME is listed in the encoding table above).
The basic idea of Base64 is that simple: every 3 bytes (24 bits) are converted into 4 characters. Because a 6-bit binary number can represent 64 different values, once a character set of 64 characters is fixed and each character is given a unique code, binary bytes can be converted to Base64 and back.
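The three steps can be followed literally in code by building the bit string as text. This is a deliberately naive sketch for illustration, not how a real encoder works; the class ManToBase64 and the method encodeGroups are hypothetical names, and the input is assumed to be ASCII with a length that is a multiple of 3.

```java
final class ManToBase64 {
    static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Assumes ASCII input whose length is a multiple of 3 (no zero filling here).
    static String encodeGroups(String text) {
        StringBuilder bits = new StringBuilder();
        for (char c : text.toCharArray()) {
            // Step 1: 8-bit binary form of each code. Setting bit 8 and
            // dropping the leading '1' keeps the leading zeros.
            bits.append(Integer.toBinaryString(0x100 | c).substring(1));
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < bits.length(); i += 6) {
            // Step 2: take 6 bits at a time. Step 3: look up the character.
            int value = Integer.parseInt(bits.substring(i, i + 6), 2);
            out.append(ALPHABET.charAt(value));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeGroups("Man"));
    }
}
```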
Zero filling
After repeatedly converting every 3 bytes into 4 Base64 characters, one of the following 3 situations arises at the end:
1. No bytes are left
2. 1 byte is left
3. 2 bytes are left
Case 1 needs no further discussion. How are cases 2 and 3 handled?
In these cases, zeros are appended after the remaining bytes until the number of bits is divisible by 6 (because Base64 encodes 6 bits at a time). If 1 byte (8 bits) is left, 4 zeros are appended to make 12 bits, which split into 2 groups. If 2 bytes (16 bits) are left, only 2 zeros are needed to make 18 bits, which split into 3 groups. The groups are then mapped to characters in the usual way.
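Zero filling on its own, without any “=” characters, can be sketched with a small bit buffer. The class ZeroFillEncoder below is a hypothetical illustration, not the implementation used by any particular library; it emits one character for every 6 bits and fills the last group with zeros.

```java
final class ZeroFillEncoder {
    static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Encodes without '=' characters: only zero filling is applied.
    static String encode(byte[] data) {
        StringBuilder out = new StringBuilder();
        int buffer = 0;   // bits read but not yet emitted
        int bitCount = 0; // number of valid bits in buffer
        for (byte b : data) {
            buffer = (buffer << 8) | (b & 0xFF);
            bitCount += 8;
            while (bitCount >= 6) {
                bitCount -= 6;
                out.append(ALPHABET.charAt((buffer >> bitCount) & 0x3F));
            }
            buffer &= (1 << bitCount) - 1; // keep only the unused bits
        }
        if (bitCount > 0) {
            // 2 or 4 bits remain: shifting left fills the group with zeros.
            out.append(ALPHABET.charAt((buffer << (6 - bitCount)) & 0x3F));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(new byte[] {'S'}));
        System.out.println(encode(new byte[] {'S', 'o'}));
    }
}
```

One leftover byte yields 2 characters and two leftover bytes yield 3 characters, matching the group counts described above.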
When decoding, every 4 characters are restored to 3 bytes, and one of 3 situations arises at the end:
1. No characters are left
2. 2 characters are left
3. 3 characters are left
These 3 situations correspond one-to-one with the 3 cases above. Reversing the zero-filling step restores the original data.
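Reversing the process can be sketched the same way: collect 6 bits per character, emit a byte whenever 8 bits are available, and discard the few zero bits left at the end. The class ZeroFillDecoder is a hypothetical name for this illustration; for brevity it does not reject malformed input such as a single leftover character.

```java
import java.io.ByteArrayOutputStream;

final class ZeroFillDecoder {
    static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Decodes Base64 text; trailing '=' characters, if any, are ignored.
    static byte[] decode(String text) {
        int end = text.length();
        while (end > 0 && text.charAt(end - 1) == '=') {
            end--;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int buffer = 0;   // bits read but not yet emitted
        int bitCount = 0; // number of valid bits in buffer
        for (int i = 0; i < end; i++) {
            int value = ALPHABET.indexOf(text.charAt(i));
            if (value < 0) {
                throw new IllegalArgumentException("Not a Base64 character: " + text.charAt(i));
            }
            buffer = (buffer << 6) | value;
            bitCount += 6;
            if (bitCount >= 8) {
                bitCount -= 8;
                out.write((buffer >> bitCount) & 0xFF);
                buffer &= (1 << bitCount) - 1; // keep only the unused bits
            }
        }
        // Any 2 or 4 bits still in the buffer are the zero fill and are dropped.
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(new String(decode("U28"), java.nio.charset.StandardCharsets.US_ASCII));
    }
}
```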
Padding
We often see “=” characters in a Base64-encoded string; they are produced by padding. Padding means that when case 2 or 3 occurs during encoding, “=” characters are appended so that the number of encoded characters is a multiple of 4.
It follows that in case 2, where 1 byte is left, 2 “=” are needed: the last byte is encoded as 2 characters, and 2 “=” bring the total to 4. Likewise, case 3 needs 1 “=”.
Padding is not strictly necessary, because the number of missing bytes can be worked out from the length of the encoded content. Some implementations therefore require padding and others do not. One situation where padding is required is when several Base64-encoded files are concatenated into one file.
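Java's encoder is one implementation where padding is optional. The sketch below uses Base64.getEncoder().withoutPadding() and relies on the documented behaviour that the basic decoder accepts input without “=”; the class OptionalPadding and the helper decodedLength are illustrative names only.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class OptionalPadding {
    // Standard alphabet, but without trailing '=' characters.
    static String encodeUnpadded(byte[] data) {
        return Base64.getEncoder().withoutPadding().encodeToString(data);
    }

    // Number of bytes represented by an unpadded string of the given length:
    // each character carries 6 bits, and the leftover fill bits are discarded.
    static int decodedLength(int encodedChars) {
        return encodedChars * 6 / 8;
    }

    public static void main(String[] args) {
        String unpadded = encodeUnpadded("S".getBytes(StandardCharsets.US_ASCII));
        System.out.println(unpadded + " represents " + decodedLength(unpadded.length()) + " byte(s)");
        byte[] decoded = Base64.getDecoder().decode(unpadded);
        System.out.println(new String(decoded, StandardCharsets.US_ASCII));
    }
}
```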
Extended topic: encrypting and decrypting with Base64
Although the start of this article stated that Base64 is not an encryption algorithm, it can in fact be used to encrypt data.
As we all know, encryption is the process of turning plaintext into ciphertext. The algorithm plays the main role in this process, followed by the key. The algorithm is like a manufacturing process and the key is like the recipe. The manufacturing process can be made public, but the recipe must be kept secret; otherwise everyone could produce Yunnan Baiyao.
It is easy to see that the “recipe” of Base64 is its character set. Choosing a different character set, or merely changing the order (numbering) of the characters within it, makes the same procedure produce a different Base64 encoding.
For example, if I do not tell you the character set used for encoding, can you work out the original text of the following code?
TWl+Im1DImR5sHR5r2tFqXN4pWQ8ImZ/tih/r2BZImJZImx5sChCpWlDrGY8ImJFtihyuSh
Eqm1DInN5r2tFrmlCInhxsHN5rGYwp3J/rSh/tmx1syhxr219oWBDLihHqm1zqih5sChxImB
FsHQwrGowtmx1ImF5r2Q8InR4oXQwo30woShApXJDpXp1s2l+oGUwrGowpmV8qWt4t
ih5ryhEqmUwoGd+tm1+tWV0Iml+pih5r2R1p2lEqWtxo2B1Imt1r2VCoXR5rGYwrGowqGZ
/tGB1pmt1Lih1umN1pWRDInR4pShDqmdCtihGpWx1rWV+oGUwrGowoWZZImNxs2Zxrih
ArmVxsHVCpSY=
If encrypting and decrypting with Base64 is entirely feasible, why do we say it is not an encryption algorithm?
The reasons are:
1. Base64 was not developed for encryption, but to make it easier to transmit binary data in a text environment
2. Therefore, unlike in the design of an encryption algorithm, security is not a goal of Base64, merely a by-product.
In fact, the security of Base64 is very poor, which is why it is not used for encryption in practice. If you know some common encryption methods, you will know the old method called “character substitution”: a rule replaces each character with another, for example a becomes c and b becomes d, and the result is the ciphertext. Decryption just applies the rule in reverse, turning c back into a and d back into b. Different substitution rules produce different ciphertexts.
Encrypting with Base64 is effectively character substitution: it merely regroups the bytes first and then substitutes. As far as encryption is concerned, the two are essentially the same.
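The equivalence with character substitution can be made concrete. In the sketch below, a “secret” alphabet is obtained by rotating the standard one by 13 positions; this rotation is an arbitrary choice for the illustration and is not the character set behind the sample ciphertext above, which the article does not reveal. Encoding with the secret alphabet is then the same as standard Base64 followed by a one-to-one character replacement. The class ShuffledAlphabet and its methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class ShuffledAlphabet {
    static final String STANDARD =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    // Hypothetical secret "recipe": the standard alphabet rotated by 13 positions.
    static final String SECRET = STANDARD.substring(13) + STANDARD.substring(0, 13);

    // Replaces every character found in 'from' with the character at the same index in 'to'.
    static String translate(String text, String from, String to) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            int i = from.indexOf(c);
            out.append(i >= 0 ? to.charAt(i) : c); // '=' passes through unchanged
        }
        return out.toString();
    }

    static String encrypt(String plain) {
        String standard = Base64.getEncoder().encodeToString(plain.getBytes(StandardCharsets.UTF_8));
        return translate(standard, STANDARD, SECRET);
    }

    static String decrypt(String cipher) {
        String standard = translate(cipher, SECRET, STANDARD);
        return new String(Base64.getDecoder().decode(standard), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String cipher = encrypt("Man");
        System.out.println(cipher);
        System.out.println(decrypt(cipher));
    }
}
```

Anyone who recovers the substitution table, for instance by frequency analysis, can read the text, which is why this offers no real security.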
Base64 usage
Java already provides a Base64 implementation, so we can simply call it. The code is as follows:
package com.first;

import org.junit.Test;

import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Named Base64Test because a class called Test would clash with the imported org.junit.Test
public class Base64Test {
    @Test
    public void test() {
        // Encode: "So" is 2 bytes, so the output ends with one '='
        String encode = Base64.getEncoder().encodeToString("So".getBytes(StandardCharsets.UTF_8));
        System.out.println(encode); // U28=
        // Decode back to the original bytes
        byte[] decode = Base64.getDecoder().decode(encode);
        System.out.println(new String(decode, StandardCharsets.UTF_8)); // So
    }
}
Base64 Characteristics
1. It is an encoding algorithm, not compression: encoding only increases the size, to about 4/3 of the original number of bytes;
2. The method is simple and has little impact on performance;
3. The algorithm is reversible and decoding is trivial, so it must not be used to protect private information;
4. Although decoding is easy, the data is still encoded, so the original content cannot be read directly with the naked eye;
5. The encoded string contains only [0-9a-zA-Z+/=], so data containing non-printable characters (including escape characters) can also be transmitted.