Deep understanding of Base64 underlying principles
2022-07-29 03:38:00 【Luo Zhaocheng CSDN】
Base64 is a common data encoding method, typically used for data transmission. Mobile developers meet it often in network requests. Anyone familiar with JSON knows that JSON serialization tools cannot put a byte array directly into JSON data; such binary data must first be Base64-encoded, and the encoded string is then placed in the JSON. Base64 has been an everyday tool for me from the start, from Base64 codec utilities "borrowed" from the Internet to the Base64 class built into the Android SDK. Although I was fluent in using it, I had never really understood the underlying mechanism of Base64.
As it happens, I recently ran into a Base64-related problem again, so I took a relatively in-depth look at how Base64 works. The following is simply my own study notes.
1. Base64 Definition
Base64 is a method of representing binary data using 64 printable characters. In plain terms, it lets binary data be printed to the console and be copied and transmitted without trouble.
The 64 printable characters used for encoding need only 6 bits each: $2^6 = 64$
In a computer, 3 bytes of binary data equal 24 bits, so encoding them takes 4 Base64 characters (24 / 6 = 4).
The Base64 printable character set contains the letters A-Z and a-z and the digits 0-9, which makes 62 characters. The other two may differ between systems: the standard alphabet uses + and /, while the URL-safe variant uses - and _.
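The character set described above can be built programmatically. The following is a minimal sketch; the class name `Base64Alphabet` and the method `build` are illustrative names, not part of any library.

```java
final class Base64Alphabet {
    // Builds a 64-character table: A-Z, a-z, 0-9, then the two
    // variant-specific characters passed in by the caller.
    static char[] build(char c62, char c63) {
        char[] table = new char[64];
        int i = 0;
        for (char ch = 'A'; ch <= 'Z'; ch++) table[i++] = ch;
        for (char ch = 'a'; ch <= 'z'; ch++) table[i++] = ch;
        for (char ch = '0'; ch <= '9'; ch++) table[i++] = ch;
        table[i++] = c62;
        table[i] = c63;
        return table;
    }

    // Standard alphabet ends in + and /, the URL-safe one in - and _.
    static final char[] STANDARD = build('+', '/');
    static final char[] URL_SAFE = build('-', '_');

    public static void main(String[] args) {
        System.out.println(new String(STANDARD));
        System.out.println(new String(URL_SAFE));
    }
}
```

The two tables differ only at indexes 62 and 63, which is why the same encoding logic can serve both variants.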
Base64 code table: indexes 0-25 map to A-Z, 26-51 to a-z, 52-61 to 0-9, 62 to +, and 63 to /.

2. Base64 transformation
Convert every byte in the byte array into 8 bits to form one continuous string of binary data, then split that string into groups of 6 bits. Each 6-bit group is read as a number, and that number is used as an index into the code table to find the character to display.
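Encoding example: the three bytes of LZC are 01001100 01011010 01000011. Split into 6-bit groups they become 010011 000101 101001 000011, that is, the indexes 19, 5, 41, and 3, which the code table maps to T, F, p, and D.
So LZC converts to the printable Base64 string TFpD. The example shows that 3 bytes turn into exactly 4 characters.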
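The conversion steps can be traced literally in code by building the bit string and cutting it into 6-bit groups. This is a deliberately naive sketch for illustration only (string manipulation is far slower than bit operations), and the class and method names are made up for this example. It handles only input whose length is a multiple of 3.

```java
final class SixBitTrace {
    static final String TABLE =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Returns the 8-bit binary form of every byte, concatenated.
    static String toBitString(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            String bin = Integer.toBinaryString(b & 0xFF);
            for (int i = bin.length(); i < 8; i++) sb.append('0'); // left-pad to 8 bits
            sb.append(bin);
        }
        return sb.toString();
    }

    // Encodes complete 3-byte groups only; padding is not handled here.
    static String encodeWholeGroups(byte[] data) {
        String bits = toBitString(data);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i + 6 <= bits.length(); i += 6) {
            int index = Integer.parseInt(bits.substring(i, i + 6), 2);
            out.append(TABLE.charAt(index));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] data = {'L', 'Z', 'C'};
        System.out.println(toBitString(data));       // 010011000101101001000011
        System.out.println(encodeWholeGroups(data)); // TFpD
    }
}
```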
You may already have spotted a problem: what happens when the length of the byte[] to encode is not divisible by three?
Besides the 64 characters used for encoding, the code table has one more character, the padding character =, which is appended at the end to fill the missing positions.
For example, the 2 bytes ZC encode to WkM=, and the single byte L encodes to TA==.
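The padding behaviour can be checked quickly with the JDK's built-in `java.util.Base64` encoder. This is only a small check, with `PaddingExamples` as an illustrative class name; the inputs are the same strings used elsewhere in this article.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class PaddingExamples {
    // Encodes the ASCII bytes of the given text with the standard alphabet.
    static String encode(String text) {
        return Base64.getEncoder()
                     .encodeToString(text.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        System.out.println(encode("LZC")); // TFpD  (3 bytes, no padding)
        System.out.println(encode("ZC"));  // WkM=  (2 bytes, one =)
        System.out.println(encode("L"));   // TA==  (1 byte, two =)
    }
}
```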

3. Base64 encoding implementation (Java version)
- Encode the 3 bytes a, b, c as a Base64 string
final char[] base64 = {
'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z',
'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '+', '/'
};
byte a = 'L';
byte b = 'Z';
byte c = 'C';
int index1 = (a >> 2) & 0b00111111;
int index2 = ((a << 4) & 0b00110000) | ((b >> 4) & 0b00001111);
int index3 = ((b << 2) & 0b00111100) | ((c >> 6) & 0b00000011);
int index4 = c & 0b00111111;
byte[] result = new byte[]{
(byte) base64[index1], (byte) base64[index2], (byte) base64[index3], (byte) base64[index4]
};
System.out.println(new String(result));
// TFpD
- Encode the 2 bytes a, b as a Base64 string
byte a = 'Z';
byte b = 'C';
int index1 = (a >> 2) & 0b00111111;
int index2 = ((a << 4) & 0b00110000) | ((b >> 4) & 0b00001111);
int index3 = ((b << 2) & 0b00111111);
byte[] result = new byte[]{
(byte) base64[index1], (byte) base64[index2], (byte) base64[index3], '='
};
System.out.println(new String(result));
// WkM=
- Encode the single byte a as a Base64 string
byte a = 'L';
int index1 = (a >> 2) & 0b00111111;
int index2 = ((a << 4) & 0b00111111);
byte[] result = new byte[]{
(byte) base64[index1], (byte) base64[index2], '=', '='
};
System.out.println(new String(result));
// TA==
The code follows the conversion steps described above. The computation of index2 and index3 is still a little convoluted, though. Is there a better way to calculate them?
The JDK's Base64 implementation takes a different approach. In Java, an int occupies 4 bytes, so 3 bytes can be packed into a single int.
Here the three bytes of LZC are placed into one int variable, where they occupy the low 24 bits:

Code implementation :
byte a = 'L';
byte b = 'Z';
byte c = 'C';
int bits = (a & 0xFF) << 16 | (b & 0xFF) << 8 | (c & 0xFF);
// 10011000101101001000011
Once bits is available, bit operations can extract the four 6-bit numbers directly. The code is as follows:
// 0x3f == 0b00111111
int index1 = (bits >> 18) & 0x3F;
int index2 = (bits >> 12) & 0x3F;
int index3 = (bits >> 6) & 0x3F;
int index4 = bits & 0x3F;
byte[] result = new byte[]{
(byte) base64[index1], (byte) base64[index2], (byte) base64[index3], (byte) base64[index4]
};
System.out.println(new String(result));
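Putting the pieces together, the int-packing technique can be extended to input of any length by handling the 1-byte and 2-byte tails separately. The following is a minimal sketch under that assumption, not the JDK's actual source; `Base64EncoderSketch` is an illustrative name.

```java
final class Base64EncoderSketch {
    private static final char[] TABLE =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/".toCharArray();

    static String encode(byte[] src) {
        StringBuilder out = new StringBuilder((src.length + 2) / 3 * 4);
        int i = 0;
        // Full 3-byte groups: pack into 24 bits, emit four 6-bit indexes.
        for (; i + 3 <= src.length; i += 3) {
            int bits = (src[i] & 0xFF) << 16 | (src[i + 1] & 0xFF) << 8 | (src[i + 2] & 0xFF);
            out.append(TABLE[(bits >> 18) & 0x3F])
               .append(TABLE[(bits >> 12) & 0x3F])
               .append(TABLE[(bits >> 6) & 0x3F])
               .append(TABLE[bits & 0x3F]);
        }
        int remaining = src.length - i;
        if (remaining == 1) {
            // One byte left: two characters plus "==".
            int bits = (src[i] & 0xFF) << 16;
            out.append(TABLE[(bits >> 18) & 0x3F])
               .append(TABLE[(bits >> 12) & 0x3F])
               .append("==");
        } else if (remaining == 2) {
            // Two bytes left: three characters plus "=".
            int bits = (src[i] & 0xFF) << 16 | (src[i + 1] & 0xFF) << 8;
            out.append(TABLE[(bits >> 18) & 0x3F])
               .append(TABLE[(bits >> 12) & 0x3F])
               .append(TABLE[(bits >> 6) & 0x3F])
               .append('=');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(new byte[]{'L', 'Z', 'C'})); // TFpD
        System.out.println(encode(new byte[]{'Z', 'C'}));      // WkM=
        System.out.println(encode(new byte[]{'L'}));           // TA==
    }
}
```

Masking each byte with 0xFF matters because Java bytes are signed; without it, a negative byte would sign-extend and corrupt the higher bits of the packed int.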
4. Question 1: Does Base64 encoding have to add padding?
As described earlier, when fewer than 3 bytes remain to be converted, the padding character = is appended to the encoded string so that the length of the final Base64 string is an exact multiple of 4. This raises the question: why pad with =? Can the padding be left out?
The android.util.Base64 implementation has a flag called NO_PADDING. When this flag is passed in, the output contains no = padding. So when encoding fewer than 3 remaining bytes, the padding can indeed be omitted.
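`android.util.Base64` is only available on Android, so the sketch below uses the JDK's `java.util.Base64` instead, whose encoder offers a comparable `withoutPadding()` option. The helper names `paddedLength` and `unpaddedLength` are illustrative; they simply spell out how the output length depends on the input length in each mode.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class NoPaddingSketch {
    // Padded: every started 3-byte group yields 4 characters.
    static int paddedLength(int n) {
        return (n + 2) / 3 * 4;
    }

    // Unpadded: one character per started 6 bits of input.
    static int unpaddedLength(int n) {
        return (n * 8 + 5) / 6;
    }

    public static void main(String[] args) {
        byte[] data = {'L'};
        String padded = Base64.getEncoder().encodeToString(data);
        String bare = Base64.getEncoder().withoutPadding().encodeToString(data);
        System.out.println(padded); // TA==
        System.out.println(bare);   // TA
        // The JDK's basic decoder accepts the unpadded form as well.
        byte[] decoded = Base64.getDecoder().decode(bare);
        System.out.println(new String(decoded, StandardCharsets.US_ASCII)); // L
    }
}
```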
Padding does make decoding simpler: the decoder can process the encoded string in groups of 4 characters without checking how many characters remain and branching into different decoding logic. Padding can also mark the end of an encoded value, which helps prevent decoding failures when several Base64 strings have been concatenated.
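To illustrate how padding keeps the decoder simple, here is a minimal decoding sketch that assumes padded input and walks it strictly in groups of 4 characters. It is not a production decoder: `Base64DecoderSketch` is an illustrative name, and the code does not verify that = appears only at the end.

```java
final class Base64DecoderSketch {
    private static final String TABLE =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    static byte[] decode(String s) {
        if (s.length() % 4 != 0) {
            throw new IllegalArgumentException("length must be a multiple of 4");
        }
        // Each trailing '=' means one fewer byte in the last group.
        int pad = s.endsWith("==") ? 2 : (s.endsWith("=") ? 1 : 0);
        byte[] out = new byte[s.length() / 4 * 3 - pad];
        int o = 0;
        for (int i = 0; i < s.length(); i += 4) {
            // Pack four 6-bit values into 24 bits ('=' counts as zero bits).
            int bits = 0;
            for (int j = 0; j < 4; j++) {
                char ch = s.charAt(i + j);
                int v = (ch == '=') ? 0 : TABLE.indexOf(ch);
                if (v < 0) throw new IllegalArgumentException("invalid character: " + ch);
                bits = bits << 6 | v;
            }
            // Unpack up to three bytes, skipping those that exist only as padding.
            for (int shift = 16; shift >= 0 && o < out.length; shift -= 8) {
                out[o++] = (byte) (bits >> shift);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(new String(decode("TFpD"))); // LZC
        System.out.println(new String(decode("WkM="))); // ZC
        System.out.println(new String(decode("TA=="))); // L
    }
}
```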
5. Question 2: Can Base64 be considered an encryption method?
Base64 is an encoding that converts raw data into a printable string. At work it is often called "Base64 encryption". Consider the definition of encryption:
Altering the original information with a special algorithm so that unauthorized users, even if they obtain the encrypted information, cannot understand its content because they do not know how to decrypt it.
Base64 as normally used follows the common standards RFC 4648 and RFC 2045 and relies on a standard code table, so it cannot be regarded as "encryption". However, if a private code table is used during encoding and decoding, the encoded content becomes harder to decode, which gives the original content a kind of "encryption".
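The effect of a private code table can be sketched by remapping the standard alphabet. Everything specific here is an assumption made for illustration: the `CUSTOM` table is simply the standard table rotated by 13 positions, and `CustomTableSketch` is a made-up class name. Note that this amounts to a simple character substitution, so it obscures the content rather than providing cryptographic protection.

```java
import java.util.Base64;

final class CustomTableSketch {
    static final String STANDARD =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    // Hypothetical private table: the standard one rotated by 13 positions.
    static final String CUSTOM = STANDARD.substring(13) + STANDARD.substring(0, 13);

    // Replaces every character found in 'from' with the character at the
    // same index in 'to'; anything else (such as '=') passes through.
    static String translate(String text, String from, String to) {
        StringBuilder sb = new StringBuilder(text.length());
        for (char ch : text.toCharArray()) {
            int idx = from.indexOf(ch);
            sb.append(idx >= 0 ? to.charAt(idx) : ch);
        }
        return sb.toString();
    }

    static String encode(byte[] data) {
        return translate(Base64.getEncoder().encodeToString(data), STANDARD, CUSTOM);
    }

    static byte[] decode(String text) {
        return Base64.getDecoder().decode(translate(text, CUSTOM, STANDARD));
    }

    public static void main(String[] args) {
        byte[] data = {'L', 'Z', 'C'};
        String encoded = encode(data);
        System.out.println(encoded);                     // differs from the standard TFpD
        System.out.println(new String(decode(encoded))); // LZC
    }
}
```

Anyone who knows or recovers the table can decode the text exactly as before, which is why the article puts "encryption" in quotation marks.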
6. Conclusion
This article has covered the principle of the Base64 encoding process and its code implementation. If you are interested, try implementing the decoding side yourself to understand the principle better. When you know not only how something works but also why, writing code feels effortless.