Deep understanding of Base64 underlying principles
2022-07-29 03:38:00 【Luo Zhaocheng CSDN】
Base64 is a common data encoding scheme, often used for data transmission. Mobile developers meet it all the time in network requests. Anyone familiar with JSON knows that JSON serialization tools cannot put a byte array directly into JSON data; binary data of this kind has to be Base64-encoded first, and the encoded string is then placed in the JSON. Base64 has been a routine tool for me from the start: first with codec utilities "borrowed" from the Internet, later with the Base64 class built into the Android SDK. Although I was fluent in using it, I had never looked closely at the underlying mechanism of Base64.
As it happens, I recently ran into a Base64-related issue again, so I took the time to study it in more depth. The following is only a record of my own learning.
1. Base64 Definition
Base64 is a way of representing binary data using 64 printable characters. In plain terms, it lets binary data be printed to the console and be copied and transmitted reliably.
Each of the 64 printable characters used in the encoding can be represented with only 6 bits: $2^6 = 64$
In the computer world, 3 bytes of binary data equal 24 bits, so encoding them takes 4 Base64 characters (24 / 6 = 4).
The Base64 printable character set contains the letters A-Z and a-z and the digits 0-9, 62 characters in total. The other two can differ between systems: usually + and / are used, and in URL-safe mode - and _ are used.
Base64 code table: indexes 0-25 map to A-Z, 26-51 map to a-z, 52-61 map to 0-9, 62 maps to + and 63 maps to /.
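As a quick check of the last two characters, the sketch below uses the JDK's `java.util.Base64`, which is brought in here only for illustration. The class and method names `Base64AlphabetDemo`, `standard`, and `urlSafe` are hypothetical. The two input bytes are chosen so that their 6-bit groups land on indexes 62 and 63.

```java
import java.util.Base64;

class Base64AlphabetDemo {
    // Encodes with the standard alphabet, whose last two characters are '+' and '/'.
    static String standard(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Encodes with the URL-safe alphabet, whose last two characters are '-' and '_'.
    static String urlSafe(byte[] data) {
        return Base64.getUrlEncoder().encodeToString(data);
    }

    public static void main(String[] args) {
        // 0xFB 0xFF = 111110 111111 1111(00), i.e. indexes 62, 63, 60.
        byte[] data = {(byte) 0xFB, (byte) 0xFF};
        System.out.println(standard(data)); // +/8=
        System.out.println(urlSafe(data));  // -_8=
    }
}
```

Only the characters at indexes 62 and 63 differ between the two outputs; the rest of the table is shared.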

2. Base64 conversion
We convert every byte of the byte array into 8 bits, forming one long string of binary data, and then split that string into groups of 6 bits. Each 6-bit group is read as a number, and that number is used as an index into the code table to look up the character to display.
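A coding example: the three bytes of LZC are 01001100 01011010 01000011. Regrouped into 6-bit units they become 010011 000101 101001 000011, that is, the indexes 19, 5, 41 and 3, which the code table maps to T, F, p and D.
So the Base64-encoded printable string for LZC is TFpD. This example also shows that 3 bytes turn into exactly 4 Base64 characters.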
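The regrouping step can be reproduced with a few lines of Java. This is a minimal sketch, assuming the input length is a multiple of 3 bytes so that the bit string divides evenly into 6-bit groups; the class and method names (`Base64BitGroups`, `toBitString`, `toIndexes`) are hypothetical.

```java
import java.util.Arrays;

class Base64BitGroups {
    // Returns the input as one bit string, exactly 8 bits per byte.
    static String toBitString(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte value : data) {
            String bits = Integer.toBinaryString(value & 0xFF);
            // Left-pad with zeros so that every byte contributes 8 characters.
            sb.append(String.format("%8s", bits).replace(' ', '0'));
        }
        return sb.toString();
    }

    // Splits the bit string into 6-bit groups and reads each group as an index.
    // Assumes bitString.length() is a multiple of 6.
    static int[] toIndexes(String bitString) {
        int[] indexes = new int[bitString.length() / 6];
        for (int i = 0; i < indexes.length; i++) {
            indexes[i] = Integer.parseInt(bitString.substring(i * 6, i * 6 + 6), 2);
        }
        return indexes;
    }

    public static void main(String[] args) {
        byte[] data = {'L', 'Z', 'C'};
        String bits = toBitString(data);
        System.out.println(bits);                             // 010011000101101001000011
        System.out.println(Arrays.toString(toIndexes(bits))); // [19, 5, 41, 3]
    }
}
```

Looking up indexes 19, 5, 41 and 3 in the code table gives T, F, p and D.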
Sharp readers may already have spotted a problem: what happens if the length of the byte[] to be encoded is not divisible by three?
Besides the 64 characters used for encoding, the code table has one more character, the padding character =, which is appended at the end to fill the missing positions.
For example, the two bytes ZC encode to WkM= (one padding character), and the single byte L encodes to TA== (two padding characters).
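The three cases (3, 2 and 1 input bytes) can be checked against the JDK's own encoder. The sketch below uses `java.util.Base64` purely as a reference; `Base64PaddingDemo` and `encode` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64PaddingDemo {
    // Encodes the ASCII bytes of the text with the standard, padded Base64 encoder.
    static String encode(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        System.out.println(encode("LZC")); // TFpD (3 bytes, no padding)
        System.out.println(encode("ZC"));  // WkM= (2 bytes, one '=')
        System.out.println(encode("L"));   // TA== (1 byte, two '=')
    }
}
```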

3. Base64 encoding implementation (Java version)
- Encode three bytes a, b, c as a Base64 string
// Standard Base64 code table: the array index is the 6-bit value.
final char[] base64 = {
    'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
    'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z',
    'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
    'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '+', '/'
};
byte a = 'L';
byte b = 'Z';
byte c = 'C';
// Top 6 bits of a.
int index1 = (a >> 2) & 0b00111111;
// Low 2 bits of a followed by the top 4 bits of b.
int index2 = ((a << 4) & 0b00110000) | ((b >> 4) & 0b00001111);
// Low 4 bits of b followed by the top 2 bits of c.
int index3 = ((b << 2) & 0b00111100) | ((c >> 6) & 0b00000011);
// Low 6 bits of c.
int index4 = c & 0b00111111;
byte[] result = new byte[]{
    (byte) base64[index1], (byte) base64[index2], (byte) base64[index3], (byte) base64[index4]
};
System.out.println(new String(result));
// TFpD
- Encode two bytes a, b as a Base64 string
byte a = 'Z';
byte b = 'C';
int index1 = (a >> 2) & 0b00111111;
int index2 = ((a << 4) & 0b00110000) | ((b >> 4) & 0b00001111);
// Low 4 bits of b; the 2 missing bits are zero-filled.
int index3 = (b << 2) & 0b00111100;
byte[] result = new byte[]{
    (byte) base64[index1], (byte) base64[index2], (byte) base64[index3], '='
};
System.out.println(new String(result));
// WkM=
- Encode one byte a as a Base64 string
byte a = 'L';
int index1 = (a >> 2) & 0b00111111;
// Low 2 bits of a; the 4 missing bits are zero-filled.
int index2 = (a << 4) & 0b00110000;
byte[] result = new byte[]{
    (byte) base64[index1], (byte) base64[index2], '=', '='
};
System.out.println(new String(result));
// TA==
The code follows the conversion steps described above, but the calculations of index2 and index3 are still somewhat convoluted. Is there a better way to compute them?
The Base64 implementation in the JDK takes a different approach. In Java, an int occupies 4 bytes, so 3 bytes can be packed into a single int.
For example, the three bytes of LZC are placed in the low 24 bits of one int variable, with L in bits 16-23, Z in bits 8-15 and C in bits 0-7.
Code implementation:
byte a = 'L';
byte b = 'Z';
byte c = 'C';
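// Mask each byte with 0xFF to avoid sign extension, then shift it into place.
int bits = (a & 0xFF) << 16 | (b & 0xFF) << 8 | (c & 0xFF);
// Binary value: 10011000101101001000011 (the leading zero is omitted)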
Once bits is available, the four 6-bit values can be extracted directly with shifts and a mask. The code is as follows:
// 0x3f == 0b00111111
int index1 = (bits >> 18) & 0x3F;
int index2 = (bits >> 12) & 0x3F;
int index3 = (bits >> 6) & 0x3F;
int index4 = bits & 0x3F;
byte[] result = new byte[]{
(byte) base64[index1], (byte) base64[index2], (byte) base64[index3], (byte) base64[index4]
};
System.out.println(new String(result));
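Putting the pieces together, the int-packing idea extends to input of any length. The following is a minimal sketch, not the JDK's actual implementation: it handles full 3-byte groups in a loop and then the 2-byte or 1-byte tail with = padding. The names `SimpleBase64Encoder`, `TABLE` and `encode` are hypothetical.

```java
class SimpleBase64Encoder {
    // Standard Base64 code table; the position in the string is the 6-bit value.
    private static final char[] TABLE =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/".toCharArray();

    static String encode(byte[] data) {
        StringBuilder out = new StringBuilder((data.length + 2) / 3 * 4);
        int i = 0;
        // Full groups: pack 3 bytes into 24 bits and emit four 6-bit indexes.
        for (; i + 3 <= data.length; i += 3) {
            int bits = (data[i] & 0xFF) << 16 | (data[i + 1] & 0xFF) << 8 | (data[i + 2] & 0xFF);
            out.append(TABLE[(bits >> 18) & 0x3F])
               .append(TABLE[(bits >> 12) & 0x3F])
               .append(TABLE[(bits >> 6) & 0x3F])
               .append(TABLE[bits & 0x3F]);
        }
        int remaining = data.length - i;
        if (remaining == 2) {
            // Two bytes left: three characters plus one '='.
            int bits = (data[i] & 0xFF) << 16 | (data[i + 1] & 0xFF) << 8;
            out.append(TABLE[(bits >> 18) & 0x3F])
               .append(TABLE[(bits >> 12) & 0x3F])
               .append(TABLE[(bits >> 6) & 0x3F])
               .append('=');
        } else if (remaining == 1) {
            // One byte left: two characters plus "==".
            int bits = (data[i] & 0xFF) << 16;
            out.append(TABLE[(bits >> 18) & 0x3F])
               .append(TABLE[(bits >> 12) & 0x3F])
               .append("==");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(new byte[]{'L', 'Z', 'C'})); // TFpD
        System.out.println(encode(new byte[]{'Z', 'C'}));      // WkM=
        System.out.println(encode(new byte[]{'L'}));           // TA==
    }
}
```

In the tail cases the missing bytes are simply left as zero bits in the packed int, which matches the zero-filling done by hand in the 2-byte and 1-byte snippets earlier.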
4. Question one: must Base64 encoding add padding?
As described earlier, when the input is not a multiple of 3 bytes, the padding character = is appended to the encoded string so that the length of the final Base64 output is an exact multiple of 4. That raises the question: why pad with =, and is it acceptable not to pad at all?
The android.util.Base64 implementation has a flag called NO_PADDING. If this flag is passed in, the output contains no = padding. So when encoding, input that is not a multiple of 3 bytes can also be left unpadded.
Of course, with padding, decoding is simpler: the encoded string can be processed four characters at a time, without checking how many characters remain and branching into different decoding logic. Padding can also mark the end of an encoded value, so that decoding does not go wrong when several Base64 strings are concatenated.
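The effect of dropping the padding can be seen with the JDK's `java.util.Base64`, used here in place of the Android class mentioned above; `withoutPadding()` plays roughly the role of the NO_PADDING flag. The class and method names `Base64NoPaddingDemo`, `encodeUnpadded` and `decode` are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64NoPaddingDemo {
    // Encodes the ASCII bytes of the text without trailing '=' characters.
    static String encodeUnpadded(String text) {
        return Base64.getEncoder().withoutPadding()
                .encodeToString(text.getBytes(StandardCharsets.US_ASCII));
    }

    // Decodes Base64 text (padded or unpadded) back to an ASCII string.
    static String decode(String base64) {
        return new String(Base64.getDecoder().decode(base64), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(encodeUnpadded("L")); // TA
        System.out.println(decode("TA=="));      // L
        System.out.println(decode("TA"));        // L

        // Two unpadded encodings of "L" joined together lose their boundary:
        // "TATA" is read as one 4-character group and yields 3 bytes, not "LL".
        System.out.println(Base64.getDecoder().decode("TATA").length); // 3
    }
}
```

The last line illustrates the concatenation concern: without the = marker, a decoder has no way to tell where one encoded value ends and the next begins.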
5. Question two: can Base64 be considered an encryption method?
Base64 is an encoding scheme that converts the original data into a printable string. At work it is often called "Base64 encryption". According to the definition of encryption:
Encryption changes the original information with a special algorithm, so that even if unauthorized users obtain the encrypted information, they still cannot understand its content because they do not know how to decrypt it.
In practice Base64 follows public encoding standards, RFC 4648 and RFC 2045, and uses a standard code table, so it cannot be regarded as "encryption". However, if a special code table is used during encoding and decoding, the encoded content becomes harder to decode, which gives the original content a kind of "encryption".
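A special code table can be sketched as follows. The table used here (the standard alphabet reversed) is a hypothetical choice for illustration, as are the names `CustomTableBase64`, `translate`, `encode` and `decode`. The sketch encodes with the JDK's standard encoder and then substitutes characters one for one, so it is a fixed character substitution that obscures the content rather than strong encryption.

```java
import java.util.Arrays;
import java.util.Base64;

class CustomTableBase64 {
    static final String STANDARD =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    // Hypothetical private table: the standard alphabet in reverse order.
    static final String CUSTOM = new StringBuilder(STANDARD).reverse().toString();

    // Replaces each character found in 'from' with the character at the same
    // position in 'to'; other characters (such as '=') are kept unchanged.
    static String translate(String text, String from, String to) {
        StringBuilder out = new StringBuilder(text.length());
        for (char ch : text.toCharArray()) {
            int index = from.indexOf(ch);
            out.append(index >= 0 ? to.charAt(index) : ch);
        }
        return out.toString();
    }

    static String encode(byte[] data) {
        return translate(Base64.getEncoder().encodeToString(data), STANDARD, CUSTOM);
    }

    static byte[] decode(String text) {
        return Base64.getDecoder().decode(translate(text, CUSTOM, STANDARD));
    }

    public static void main(String[] args) {
        byte[] data = {'L', 'Z', 'C'};
        String encoded = encode(data);
        System.out.println(encoded); // differs from the standard result TFpD
        System.out.println(Arrays.equals(decode(encoded), data)); // true
    }
}
```

A standard decoder turns the custom-table output into different bytes, but anyone who knows or reconstructs the table can reverse it.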
6. Conclusion
This article has covered the principle of the Base64 encoding process and its code implementation. If you are interested, try implementing the decoder yourself to understand the principle even better. Knowing not only what something does but also why it works makes writing code feel effortless.
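For comparison with your own attempt, here is one possible decoder sketch. It assumes the standard code table and well-formed input, strips any trailing = and then accumulates 6 bits per character, emitting a byte whenever at least 8 bits are available. `SimpleBase64Decoder` and `decode` are hypothetical names, and this is not how the JDK implements decoding.

```java
import java.io.ByteArrayOutputStream;

class SimpleBase64Decoder {
    private static final String TABLE =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    static byte[] decode(String text) {
        // Ignore trailing padding; it carries no data bits.
        int end = text.length();
        while (end > 0 && text.charAt(end - 1) == '=') {
            end--;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int bits = 0;     // bit accumulator; only the low bits are meaningful
        int bitCount = 0; // number of bits in the accumulator not yet written
        for (int i = 0; i < end; i++) {
            int index = TABLE.indexOf(text.charAt(i));
            if (index < 0) {
                throw new IllegalArgumentException("Not a Base64 character: " + text.charAt(i));
            }
            bits = (bits << 6) | index;
            bitCount += 6;
            if (bitCount >= 8) {
                bitCount -= 8;
                out.write((bits >> bitCount) & 0xFF);
            }
        }
        // Any leftover bits are the zero fill added by the encoder and are dropped.
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(new String(decode("TFpD"))); // LZC
        System.out.println(new String(decode("WkM="))); // ZC
        System.out.println(new String(decode("TA==")));  // L
    }
}
```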