What is Base64?
2022-07-07 15:37:00 【nsnsttn】
What is Base64?
Base64 is a binary-to-text encoding. More specifically, it is a way of encoding a byte array as a string that contains only basic ASCII characters.
For example, the string ShuSheng007 corresponds to the Base64 string U2h1U2hlbmcwMDc=. The trailing = is special: it is a padding character, explained later.
It is worth noting that Base64 is not an encryption algorithm. It is only an encoding, and the algorithm is public, so you cannot rely on it for confidentiality.
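A minimal sketch of the example above, assuming the JDK's built-in java.util.Base64 and UTF-8 text (the class and method names are only illustrative). It shows that decoding needs no key, which is why Base64 offers no secrecy:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class Base64RoundTrip {
    // Encode the UTF-8 bytes of a string as standard Base64 text.
    static String encode(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    // Decode Base64 text back to a string; no key or secret is involved.
    static String decode(String base64) {
        return new String(Base64.getDecoder().decode(base64), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String encoded = encode("ShuSheng007");
        System.out.println(encoded);         // U2h1U2hlbmcwMDc=
        System.out.println(decode(encoded)); // ShuSheng007
    }
}
```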
Why is it called Base64?
Because it is an encoding based on 64 characters. The encoded text contains only 64 ASCII characters (plus an occasional padding character =), as shown below:
Base64 uses 64 characters:
- A-Z: 26
- a-z: 26
- 0-9: 10
- +: 1
- /: 1
In the Base64 table, each number from 0 to 63 corresponds to one of the characters above, in the order listed.
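The mapping between values and characters can be sketched as a simple lookup string. This is only an illustration of the standard alphabet; the class and method names are hypothetical:

```java
class Base64Alphabet {
    // The standard alphabet in index order: A-Z (0-25), a-z (26-51), 0-9 (52-61), '+' (62), '/' (63).
    static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZ" + "abcdefghijklmnopqrstuvwxyz" + "0123456789" + "+/";

    // Map a 6-bit value (0..63) to its Base64 character.
    static char toChar(int value) {
        if (value < 0 || value > 63) {
            throw new IllegalArgumentException("value must be in 0..63: " + value);
        }
        return ALPHABET.charAt(value);
    }

    // Map a Base64 character back to its 6-bit value, or -1 if it is not in the alphabet.
    static int toValue(char c) {
        return ALPHABET.indexOf(c);
    }

    public static void main(String[] args) {
        for (int value : new int[]{0, 25, 26, 51, 52, 61, 62, 63}) {
            System.out.println(value + " -> " + toChar(value));
        }
    }
}
```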
What problem does Base64 solve?
Base64 encoding maps binary values onto a fixed set of characters. There are 64 of these characters, hence the name Base64.
Why not transmit the binary directly? Pictures and text are both binary byte streams on the wire anyway. Even a Base64 string is ultimately sent over the network as binary (usually UTF-8, which is ASCII-compatible). So what is the point of spending 4/3 times the bandwidth on Base64?
The real reason is binary incompatibility. Some byte values are interpreted and handled differently on different hardware, such as different routers or old computers. Some old software and network protocols have similar problems.
In practice, after a message has been compressed and encrypted, the last step is usually Base64 encoding, because a Base64 string is better suited to transmission across different platforms and languages.
Properties of Base64 encoding:
- It is an encoding, not compression: the output is larger than the input, usually by about 1/3 (every 3 input bytes become 4 output characters).
- The method is simple and has little effect on performance.
- The algorithm is reversible and decoding is easy, so it is not suitable for transmitting private data.
- Still, because the data is encoded, the original content cannot be read directly by eye.
- The encoded string contains only the characters 0-9, a-z, A-Z, +, / and =, so data containing non-printable characters (such as escape characters) can also be transmitted.
Base64 was created to solve the problem of binary incompatibility across systems and transmission protocols.
// During binary data transmission, data may be lost if it contains invisible characters
// or bytes that cannot be decoded as UTF-8 (every character has a byte sequence,
// but not every byte sequence corresponds to a character).
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class Base64Demo {
    public static void main(String[] args) {
        // A byte array, often used to represent binary data
        byte[] bytes1 = new byte[]{31, -117, 8};
        // Convert the byte array directly to a string, decoding it as UTF-8
        String str = new String(bytes1, StandardCharsets.UTF_8);
        System.out.println(str);
        byte[] bytes2 = str.getBytes(StandardCharsets.UTF_8);
        // Result: [31, -17, -65, -67, 8], which differs from the original bytes1 [31, -117, 8]
        System.out.println(Arrays.toString(bytes2));
        // Why? UTF-8 is a variable-length encoding. The byte -117 (0x8B) is not a valid
        // character on its own, so it is replaced by U+FFFD, which encodes as [-17, -65, -67].

        // ASCII data (UTF-8 is ASCII-compatible) maps cleanly and nothing is lost
        byte[] bytes3 = new byte[]{49, 50, 51};
        String str2 = new String(bytes3, StandardCharsets.UTF_8);
        System.out.println(str2); // 123
        byte[] bytes4 = str2.getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.toString(bytes4)); // [49, 50, 51], the same as bytes3

        // With most other encodings, every character has a binary code,
        // but a binary code does not necessarily map back to a character.
        // Base64 avoids this: it takes the input 6 bits at a time and maps each group
        // to one of 64 visible ASCII characters (A-Z a-z 0-9 + /).
        // Any binary data can be split into 6-bit groups this way. If the bit length is not
        // a multiple of 6, the last group is filled with 0 bits and = is appended as padding.
        // The calls below use java.util.Base64 from the JDK.
        String encode = Base64.getEncoder().encodeToString(bytes1);
        System.out.println(encode); // H4sI
        byte[] decode = Base64.getDecoder().decode(encode);
        System.out.println(Arrays.toString(decode)); // [31, -117, 8], the same as bytes1
    }
}
Encoding Man
The three characters Man become the four characters TWFu after encoding.
Each byte of the original data has 8 bits, while each Base64 character carries 6 bits. The encoder therefore works on groups of 24 bits, the least common multiple of 8 and 6, so the input is processed 3 bytes at a time and each group produces 4 characters.
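The 3-bytes-to-4-characters step can be sketched with plain bit operations. This is an illustrative hand-written version, not how any particular library implements it; the class and method names are hypothetical:

```java
class ManualBase64 {
    static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Encode exactly three bytes (24 bits) as four Base64 characters (4 x 6 bits).
    static String encodeTriple(byte b0, byte b1, byte b2) {
        // Pack the three bytes into one 24-bit integer; & 0xFF removes sign extension.
        int bits = ((b0 & 0xFF) << 16) | ((b1 & 0xFF) << 8) | (b2 & 0xFF);
        char[] out = new char[4];
        out[0] = ALPHABET.charAt((bits >> 18) & 0x3F); // top 6 bits
        out[1] = ALPHABET.charAt((bits >> 12) & 0x3F);
        out[2] = ALPHABET.charAt((bits >> 6) & 0x3F);
        out[3] = ALPHABET.charAt(bits & 0x3F);         // lowest 6 bits
        return new String(out);
    }

    public static void main(String[] args) {
        // M = 01001101, a = 01100001, n = 01101110
        // regrouped: 010011 010110 000101 101110 -> 19, 22, 5, 46 -> T, W, F, u
        System.out.println(encodeTriple((byte) 'M', (byte) 'a', (byte) 'n')); // TWFu
    }
}
```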
If the number of bytes to encode is not a multiple of 3, 1 or 2 bytes are left over at the end. They are handled as follows: zero bits are appended at the end so that the data can be split into 6-bit groups, the result is Base64-encoded, and then one or two = signs are appended to the encoded text to indicate how many bytes were missing from the last group.
- When 1 byte is left over (2 bytes missing), it yields 2 characters; the last 6-bit group contains four 0 bits (2*6-8=4), and two = signs are appended.
- When 2 bytes are left over (1 byte missing), they yield 3 characters; the last 6-bit group contains two 0 bits (3*6-2*8=2), and one = sign is appended.
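The padding rule can be sketched as a small hand-written encoder. It is a teaching version under the assumption of the standard alphabet; in real code the JDK's java.util.Base64 would normally be used instead, and the class name here is hypothetical:

```java
class PaddingDemo {
    static final String ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Encode arbitrary bytes, padding the final group with '=' when it is incomplete.
    static String encode(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < data.length; i += 3) {
            int remaining = Math.min(3, data.length - i);
            // Pack up to three bytes into 24 bits; missing bytes stay as zero bits.
            int bits = (data[i] & 0xFF) << 16;
            if (remaining > 1) bits |= (data[i + 1] & 0xFF) << 8;
            if (remaining > 2) bits |= (data[i + 2] & 0xFF);
            sb.append(ALPHABET.charAt((bits >> 18) & 0x3F));
            sb.append(ALPHABET.charAt((bits >> 12) & 0x3F));
            // A group built only from missing bytes is written as '=' instead of 'A'.
            sb.append(remaining > 1 ? ALPHABET.charAt((bits >> 6) & 0x3F) : '=');
            sb.append(remaining > 2 ? ALPHABET.charAt(bits & 0x3F) : '=');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("M".getBytes()));   // TQ==  (1 byte left over, two = signs)
        System.out.println(encode("Ma".getBytes()));  // TWE=  (2 bytes left over, one = sign)
        System.out.println(encode("Man".getBytes())); // TWFu  (no padding)
    }
}
```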
Therefore Base64-encoded data is longer than the original data: about 4/3 of the original size.
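As a rough check of the 4/3 figure, the padded output length for n input bytes is 4 * ceil(n / 3). The sketch below compares that formula with what the JDK encoder actually produces; the helper name encodedLength is hypothetical:

```java
import java.util.Base64;

class Base64Length {
    // Padded Base64 length: every started group of 3 input bytes costs 4 output characters.
    static int encodedLength(int inputBytes) {
        return 4 * ((inputBytes + 2) / 3);
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 6; n++) {
            int actual = Base64.getEncoder().encodeToString(new byte[n]).length();
            System.out.println(n + " bytes -> " + encodedLength(n) + " chars (JDK: " + actual + ")");
        }
    }
}
```

For small inputs the padding makes the overhead larger than 1/3 (1 byte becomes 4 characters); the ratio approaches 4/3 as the input grows.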
Base64 in Data URI format
Sometimes the Base64 string a web page sends you is preceded by something like the following:
data:image/jpeg;base64, /9j/4AA...
This is a Data URI. Most browsers can open such binary data directly. Note, however, that if you only want the actual Base64 content, you need to take the part after the comma.
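Extracting the payload can be sketched as follows. This assumes the simple form data:&lt;media type&gt;;base64,&lt;payload&gt; and does not attempt to handle every Data URI variant; the class name and the sample URI are made up for illustration:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class DataUriPayload {
    // Return the decoded bytes of a Base64 Data URI.
    static byte[] decode(String dataUri) {
        String marker = ";base64,";
        int idx = dataUri.indexOf(marker);
        if (!dataUri.startsWith("data:") || idx < 0) {
            throw new IllegalArgumentException("not a Base64 Data URI");
        }
        // Everything after the comma is the Base64 content; trim stray whitespace around it.
        String payload = dataUri.substring(idx + marker.length()).trim();
        return Base64.getDecoder().decode(payload);
    }

    public static void main(String[] args) {
        byte[] bytes = decode("data:text/plain;base64,SGVsbG8=");
        System.out.println(new String(bytes, StandardCharsets.UTF_8)); // Hello
    }
}
```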
Base64 variants
Base64 can be used to pass relatively long identifiers in an HTTP environment. For example, the Java persistence framework Hibernate uses Base64 to encode a long unique identifier (usually a 128-bit UUID) as a string for use as a parameter in HTTP forms and HTTP GET URLs. Other applications also often need to encode binary data into a form suitable for URLs (including hidden form fields). In these cases Base64 is compact, and it is also not human-readable: the encoded data cannot be understood at a glance.
However, standard Base64 is not suitable for putting directly into a URL, because URL encoders convert the / and + characters of standard Base64 into the %XX form, and these % signs need further conversion when stored in a database, because ANSI SQL uses % as a wildcard.
To solve this problem, there is a modified Base64 for URLs. It does not append = padding at the end, and it replaces the + and / of standard Base64 with - and _ respectively. This removes the need for conversion during URL encoding/decoding and database storage, avoids the increase in length that such conversion would cause, and keeps the format of object identifiers consistent across databases, forms, and so on.
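The JDK ships a URL-safe encoder that follows this scheme. The sketch below contrasts it with the standard encoder and, as an assumed use case, packs a 128-bit UUID into a short URL-friendly string; the class and helper names are hypothetical:

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.UUID;

class UrlSafeBase64 {
    // URL-safe alphabet ('-' and '_' instead of '+' and '/'), without trailing '=' padding.
    static String encodeUrlSafe(byte[] data) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(data);
    }

    // Pack the 16 bytes of a UUID into a 22-character URL-safe string.
    static String encodeUuid(UUID id) {
        ByteBuffer buffer = ByteBuffer.allocate(16);
        buffer.putLong(id.getMostSignificantBits());
        buffer.putLong(id.getLeastSignificantBits());
        return encodeUrlSafe(buffer.array());
    }

    public static void main(String[] args) {
        byte[] sample = {(byte) 0xFB, (byte) 0xFF};
        System.out.println(Base64.getEncoder().encodeToString(sample)); // +/8=
        System.out.println(encodeUrlSafe(sample));                      // -_8
        System.out.println(encodeUuid(UUID.randomUUID()).length());     // 22
    }
}
```

Base64.getUrlDecoder() reverses the encoding and accepts input without padding.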
There is also a modified Base64 variant for regular expressions. It changes + and / to ! and -, because +, *, and the [ and ] used in IRCu may have special meaning in regular expressions.
There are other variants as well, which change +/ to _- or ._ (for identifier names in programming languages), to .- (for Nmtoken in XML), or even to _: (for Name in XML).
Uses
- Certificates, especially root certificates, are usually Base64-encoded, since they are downloaded by many people over the Internet.
- E-mail attachments are usually Base64-encoded, because attachments often contain invisible characters.
- If you want to embed one XML file inside another, embedding it directly often breaks the XML tags and makes parsing difficult, so the XML string is converted to a byte array and then encoded into visible characters.
- Small images in a web page can be embedded in Base64-encoded form, which avoids extra requests that consume network resources.
- Older plain-text protocols such as SMTP occasionally need to transfer a file, which requires Base64.
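For the e-mail case, the JDK provides a MIME flavour of the encoder that breaks the output into lines of at most 76 characters separated by CRLF. The following is a small sketch of that encoder only, not of a full mail pipeline; the class and method names are hypothetical and the 100-byte array stands in for a real attachment:

```java
import java.util.Arrays;
import java.util.Base64;

class MimeBase64 {
    // Encode attachment bytes as MIME-style Base64: lines of at most 76 characters, CRLF between lines.
    static String encodeForMail(byte[] attachment) {
        return Base64.getMimeEncoder().encodeToString(attachment);
    }

    // Decode MIME-style Base64; line separators in the input are ignored.
    static byte[] decodeFromMail(String body) {
        return Base64.getMimeDecoder().decode(body);
    }

    public static void main(String[] args) {
        byte[] attachment = new byte[100];
        for (int i = 0; i < attachment.length; i++) {
            attachment[i] = (byte) i; // placeholder binary content
        }
        String body = encodeForMail(attachment);
        System.out.println(body);
        System.out.println(Arrays.equals(decodeFromMail(body), attachment));
    }
}
```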