What is Base64?
2022-07-07 15:37:00 【nsnsttn】
What is Base64?
Base64 is a binary-to-text encoding. More specifically, it is a way of encoding a byte array as a string that contains only basic ASCII characters.
For example, the Base64 encoding of the string ShuSheng007 is U2h1U2hlbmcwMDc=. The trailing = is special: it is a padding character, explained later.
Note that Base64 is not an encryption algorithm. It is only an encoding, and the algorithm is public, so you cannot rely on it to keep data secret.
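The example above can be reproduced with a minimal sketch. It assumes the JDK's built-in java.util.Base64 (Java 8 or later); the class name Base64RoundTrip and its helper methods are made up for illustration. It also shows why Base64 is not encryption: decoding needs no key.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class, not part of the original article.
class Base64RoundTrip {
    // Encode the UTF-8 bytes of a string as standard Base64 text.
    static String encode(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    // Decode Base64 text back into the original string; no key is involved.
    static String decode(String base64) {
        return new String(Base64.getDecoder().decode(base64), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String encoded = encode("ShuSheng007");
        System.out.println(encoded);         // U2h1U2hlbmcwMDc=
        System.out.println(decode(encoded)); // ShuSheng007
    }
}
```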
Why is it called Base64?
Because it is an encoding based on 64 characters. The encoded text contains only 64 ASCII characters (plus an occasional padding character =), as shown below:
The 64 characters used by Base64:
- A-Z: 26 characters
- a-z: 26 characters
- 0-9: 10 characters
- +: 1 character
- /: 1 character
The figure below is the Base64 index table: each number from 0 to 63 corresponds to one of the characters above.
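The index-to-character mapping can also be built in a few lines. This is only an illustrative sketch: the class name Base64Alphabet is hypothetical, and the order (A-Z, a-z, 0-9, +, /) is the standard Base64 alphabet.

```java
// Hypothetical helper that rebuilds the standard Base64 index table.
class Base64Alphabet {
    // Build the 64-character table in index order: A-Z, a-z, 0-9, '+', '/'.
    static String table() {
        StringBuilder sb = new StringBuilder(64);
        for (char c = 'A'; c <= 'Z'; c++) sb.append(c);
        for (char c = 'a'; c <= 'z'; c++) sb.append(c);
        for (char c = '0'; c <= '9'; c++) sb.append(c);
        return sb.append('+').append('/').toString();
    }

    public static void main(String[] args) {
        String table = table();
        // Print the first and last index of each character range.
        for (int index : new int[]{0, 25, 26, 51, 52, 61, 62, 63}) {
            System.out.println(index + " -> " + table.charAt(index));
        }
    }
}
```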
What problem does Base64 solve?
Base64 maps binary values onto a fixed set of 64 characters, hence the name.
Why not transmit the binary data directly? Images and text alike travel as binary byte streams, and even a Base64-encoded string is ultimately sent over the network as bytes (usually UTF-8, which is compatible with ASCII). So what is the point of spending 4/3 of the bandwidth on Base64?
The real reason is binary incompatibility. Some byte values are interpreted and handled differently on different hardware, such as certain routers and older computers. Some older software and network protocols have similar problems.
In practice, after a message has been compressed and encrypted, the last step is usually Base64 encoding, because a Base64 string travels more reliably across platforms and programming languages.
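As a rough sketch of that last step, the following compresses a message with gzip and then Base64-encodes the result. The encryption step is left out to keep the example short, and the class name CompressThenEncode and the method pack are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPOutputStream;

// Hypothetical pipeline: compress, then Base64-encode as the final step.
class CompressThenEncode {
    static String pack(String message) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(message.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The compressed bytes are arbitrary binary; Base64 turns them into plain ASCII.
        return Base64.getEncoder().encodeToString(buffer.toByteArray());
    }

    public static void main(String[] args) {
        // A gzip stream starts with the bytes 31, -117, 8, so the text starts with "H4sI".
        System.out.println(pack("hello, Base64"));
    }
}
```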
Characteristics of Base64 encoding:
- It is an encoding, not compression: encoding only increases the size (by about 1/3, since every 3 bytes become 4 characters).
- The algorithm is simple and has little effect on performance.
- The algorithm is reversible and easy to decode, so it is not suitable for transmitting private data.
- Still, the data is encoded, so the original content cannot be read at a glance.
- The encoded string contains only the characters 0-9, a-z, A-Z, +, / and =, so data containing non-printable characters (such as escape characters) can also be transmitted.
Base64 was created to solve binary incompatibility across systems and transmission protocols.
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

public class Base64Demo {
    // If binary data contains bytes that are not valid UTF-8 (every character has a byte
    // sequence, but not every byte sequence is a character), converting it to a string
    // and back can lose data.
    public static void main(String[] args) {
        // A byte array, commonly used to represent binary data
        byte[] bytes1 = new byte[]{31, -117, 8};
        // Convert the byte array directly to a string, decoding it as UTF-8
        String str = new String(bytes1, StandardCharsets.UTF_8);
        System.out.println(str); // contains the replacement character U+FFFD
        byte[] bytes2 = str.getBytes(StandardCharsets.UTF_8);
        // Prints [31, -17, -65, -67, 8], which differs from the original bytes1 [31, -117, 8]
        System.out.println(Arrays.toString(bytes2));
        // Why? UTF-8 is a variable-length encoding. The byte -117 (0x8B) is not a valid
        // UTF-8 sequence on its own, so it is replaced by U+FFFD (bytes -17, -65, -67).

        // Pure ASCII data (UTF-8 is ASCII-compatible) maps one-to-one, so nothing is lost
        byte[] bytes3 = new byte[]{49, 50, 51};
        String str2 = new String(bytes3, StandardCharsets.UTF_8);
        System.out.println(str2); // 123
        byte[] bytes4 = str2.getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.toString(bytes4)); // [49, 50, 51], same as bytes3

        // In most character encodings every character has a byte sequence, but an
        // arbitrary byte sequence does not necessarily correspond to a character.
        // Base64 can represent any byte sequence: it splits the bits into 6-bit groups
        // and maps each group to one of 64 printable ASCII characters (A-Z a-z 0-9 + /).
        // If the last group is incomplete it is filled with 0 bits, and = is appended
        // as padding until the output length is a multiple of 4.
        // java.util.Base64 from the JDK is used here.
        String encode = Base64.getEncoder().encodeToString(bytes1);
        System.out.println(encode); // H4sI
        byte[] decode = Base64.getDecoder().decode(encode);
        System.out.println(Arrays.toString(decode)); // [31, -117, 8], same as bytes1
    }
}
Encoding Man
The three characters Man become the four characters TWFu after encoding.
Each byte of the original data holds 8 bits, while each Base64 character carries 6 bits, so the encoder works on blocks whose bit count is a common multiple of 8 and 6: 24 bits, that is, 3 bytes in and 4 characters out. The input length therefore needs to be a multiple of 3 bytes.
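The regrouping of 3 bytes into four 6-bit indexes can be written out by hand. The sketch below is illustrative only: it handles exactly one 3-byte block with no padding, and the class name ManualBase64 and the method encodeTriple are hypothetical.

```java
// Hypothetical hand-rolled encoder for a single 3-byte block.
class ManualBase64 {
    static final String TABLE =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Pack three bytes into 24 bits, then read them back as four 6-bit indexes.
    static String encodeTriple(byte b0, byte b1, byte b2) {
        int bits = ((b0 & 0xFF) << 16) | ((b1 & 0xFF) << 8) | (b2 & 0xFF);
        char[] out = new char[4];
        for (int i = 0; i < 4; i++) {
            int index = (bits >> (18 - 6 * i)) & 0x3F; // next 6-bit group, high to low
            out[i] = TABLE.charAt(index);
        }
        return new String(out);
    }

    public static void main(String[] args) {
        // M = 01001101, a = 01100001, n = 01101110
        // regrouped: 010011 010110 000101 101110 = 19, 22, 5, 46
        System.out.println(encodeTriple((byte) 'M', (byte) 'a', (byte) 'n')); // TWFu
    }
}
```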
If the number of bytes to encode is not a multiple of 3, so that 1 or 2 bytes are left over at the end, it is handled as follows: first pad the end with zero bytes until the length is divisible by 3, then encode. The output characters that come entirely from padding are written as one or two = signs, which indicate the number of bytes that were added. In other words, when two bytes are missing (1 byte left over), the last 6-bit Base64 block has four bits (2*6-8=4) set to 0, and two equals signs are appended; when one byte is missing (2 bytes left over), the last 6-bit Base64 block has two bits (3*6-2*8=2) set to 0, and one equals sign is appended. Refer to the following table:
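The three padding cases can be checked with a short sketch. It assumes the JDK's java.util.Base64; the class name PaddingDemo is made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo of the 1-byte, 2-byte and 3-byte cases.
class PaddingDemo {
    static String encode(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        System.out.println(encode("M"));   // TQ==  (1 byte left over, two = signs)
        System.out.println(encode("Ma"));  // TWE=  (2 bytes left over, one = sign)
        System.out.println(encode("Man")); // TWFu  (full 3-byte block, no padding)
    }
}
```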
As a result, Base64-encoded data is longer than the original: about 4/3 of the original size.
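With padding, the exact output length follows from the 3-bytes-to-4-characters rule: every started group of 3 bytes produces 4 characters. A small sketch, in which the class name EncodedLength and the method encodedLength are hypothetical, compares the formula with the JDK encoder:

```java
import java.util.Base64;

// Hypothetical helper that computes the padded Base64 length of n input bytes.
class EncodedLength {
    // 4 * ceil(byteCount / 3), written with integer arithmetic.
    static int encodedLength(int byteCount) {
        return 4 * ((byteCount + 2) / 3);
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 6; n++) {
            int actual = Base64.getEncoder().encodeToString(new byte[n]).length();
            System.out.println(n + " bytes -> " + encodedLength(n) + " chars (JDK: " + actual + ")");
        }
    }
}
```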
Base64 in Data URI format
Sometimes the Base64 string a web page sends you is prefixed with something like the following.
data:image/jpeg;base64, /9j/4AA...
This is a Data URI. Most browsers can open such data directly. Note, however, that if you want only the actual Base64 content, you need to take the part after the comma.
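Extracting the payload can be sketched as follows. The class name DataUriPayload, the method decodePayload, and the sample URI are made up for illustration, and the sketch assumes the URI really is Base64-encoded (it does not inspect the ;base64 marker).

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper that decodes the Base64 part of a Data URI.
class DataUriPayload {
    // The payload is everything after the first comma.
    static byte[] decodePayload(String dataUri) {
        int comma = dataUri.indexOf(',');
        if (!dataUri.startsWith("data:") || comma < 0) {
            throw new IllegalArgumentException("not a Data URI");
        }
        // trim() tolerates stray whitespace around the payload.
        String payload = dataUri.substring(comma + 1).trim();
        return Base64.getDecoder().decode(payload);
    }

    public static void main(String[] args) {
        byte[] bytes = decodePayload("data:text/plain;base64,SGVsbG8=");
        System.out.println(new String(bytes, StandardCharsets.UTF_8)); // Hello
    }
}
```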
Base64 variants
Base64 can be used to carry longer identifiers in an HTTP environment. For example, Hibernate, a persistence framework for Java, uses Base64 to encode a long unique identifier (usually a 128-bit UUID) as a string that serves as a parameter in HTTP forms and HTTP GET URLs. Other applications also often need to encode binary data into a form suitable for URLs (including hidden form fields). Base64 is a good fit here: the result is compact and is not directly readable by the human eye.
However, standard Base64 is not suitable for placing directly in a URL, because a URL encoder turns the / and + characters of standard Base64 into the form %XX, and the % signs then need further conversion when stored in a database, because ANSI SQL uses % as a wildcard.
To solve this, there is a modified Base64 for URLs: it omits the = padding at the end and replaces the + and / of standard Base64 with - and _ respectively. This removes the need for conversion during URL encoding/decoding and database storage, avoids the increase in length that such conversion would cause, and unifies the format of object identifiers in databases, forms, and so on.
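In Java, this URL-oriented variant is available through Base64.getUrlEncoder(), and withoutPadding() drops the trailing = signs. The sketch below is illustrative: the class name UrlSafeBase64 and its methods are hypothetical, and the UUID helper only shows how a 128-bit identifier might be shortened, not how any particular framework does it.

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.UUID;

// Hypothetical helpers around the JDK's URL-safe Base64 encoder.
class UrlSafeBase64 {
    static String encodeUrlSafe(byte[] data) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(data);
    }

    static byte[] decodeUrlSafe(String text) {
        // The JDK decoder accepts input without padding.
        return Base64.getUrlDecoder().decode(text);
    }

    // A 128-bit UUID is 16 bytes, which encode to 22 characters without padding.
    static String encodeUuid(UUID id) {
        ByteBuffer buffer = ByteBuffer.allocate(16);
        buffer.putLong(id.getMostSignificantBits()).putLong(id.getLeastSignificantBits());
        return encodeUrlSafe(buffer.array());
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xFB, (byte) 0xFF};
        System.out.println(Base64.getEncoder().encodeToString(data)); // +/8=
        System.out.println(encodeUrlSafe(data));                      // -_8
        System.out.println(encodeUuid(UUID.randomUUID()).length());   // 22
    }
}
```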
Another variant, a modified Base64 for regular expressions, replaces + and / with ! and -, because +, *, and the [ and ] used earlier in IRCu may have special meaning in regular expressions.
Other variants replace +/ with _- or ._ (for identifier names in programming languages), with .- (for Nmtoken in XML), or even with _: (for Name in XML).
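Since these variants differ from standard Base64 only in the two symbols, one way to produce them is to encode normally and then substitute characters. The sketch below does this for the regular-expression variant (+ becomes !, / becomes -); the class name RegexSafeVariant and its methods are hypothetical, and this is not a standard library API.

```java
import java.util.Base64;

// Hypothetical converter between standard Base64 and the !/- variant.
class RegexSafeVariant {
    // Standard output never contains '!' or '-', so the substitution is unambiguous.
    static String toVariant(String standard) {
        return standard.replace('+', '!').replace('/', '-');
    }

    static String toStandard(String variant) {
        return variant.replace('!', '+').replace('-', '/');
    }

    public static void main(String[] args) {
        String standard = Base64.getEncoder().encodeToString(new byte[]{(byte) 0xFB, (byte) 0xFF});
        System.out.println(standard);            // +/8=
        System.out.println(toVariant(standard)); // !-8=
    }
}
```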
Uses
- Certificates, especially root certificates, are usually Base64-encoded so that they can be downloaded by many people over the Internet.
- E-mail attachments are usually Base64-encoded, because attachments often contain non-printable characters.
- When one XML document is embedded in another, embedding it directly tends to scramble the XML tags and makes parsing hard, so the XML string is converted to a byte array and then encoded into printable characters.
- Small images in a web page can be embedded as Base64, so no extra request is needed to fetch them.
- Older plain-text protocols such as SMTP occasionally need to transfer a file, which requires Base64.
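For the e-mail case, the JDK offers a MIME flavor of the encoder that wraps its output in lines of at most 76 characters separated by CRLF. A minimal sketch, assuming java.util.Base64 and using a made-up class name MimeAttachment with a dummy 100-byte attachment:

```java
import java.util.Base64;

// Hypothetical helper that encodes attachment bytes for a mail body.
class MimeAttachment {
    static String encodeForMail(byte[] attachment) {
        return Base64.getMimeEncoder().encodeToString(attachment);
    }

    public static void main(String[] args) {
        // 100 bytes become 136 Base64 characters: one line of 76 and one of 60.
        String encoded = encodeForMail(new byte[100]);
        for (String line : encoded.split("\r\n")) {
            System.out.println(line.length() + ": " + line);
        }
    }
}
```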