UTF encoding and character set in golang
2022-07-04 21:04:00 【Nanyidao street】
I. UTF encoding and character sets in Golang
1. Character set
A bit is either 1 or 0, and no arrangement of bits is a letter A in itself. What we can do is define a mapping between characters such as A-Z and numbers, for example letting 0100 0001 represent A. Collecting these mappings gives a lookup table between characters and numbers, and this table is called a character set.
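The mapping described above can be observed directly in Go, where a character literal is just a number. The following is a minimal sketch; the helper names `codeOf` and `bitsOf` are made up for illustration and are not part of any library.

```go
package main

import "fmt"

// codeOf returns the number that the character set assigns to a character.
// In Go, a rune is an integer type (int32) holding exactly that number.
// (Hypothetical helper, for illustration only.)
func codeOf(c rune) int {
	return int(c)
}

// bitsOf formats a number as an 8-bit binary string.
// (Hypothetical helper, for illustration only.)
func bitsOf(n int) string {
	return fmt.Sprintf("%08b", n)
}

func main() {
	fmt.Println(codeOf('A'), bitsOf(codeOf('A'))) // 65 01000001
	fmt.Println(string(rune(65)))                 // A
}
```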


2. ASCII character set
ASCII defines only 128 characters; the extended character set has 256.

3. GB2312 character set
ASCII does not support Chinese characters, which is why the GB2312 character set was created.

4. Unicode character set
Many characters are still missing from the character sets above, so a universal character set is needed. This is what the Unicode Consortium provides.
5. Fixed-length and variable-length encoding
5.1 Fixed-length encoding
Suppose we want to represent "eggo世界" ("eggo world"). We can look up the number of each character directly in the Unicode character set, but once the numbers are written one after another, deciding where one character ends and the next begins is a separate problem: if the bits are split at arbitrary points, they decode to the wrong characters.


Solution: no matter how long each number is, store every character at the width of the longest one and pad the high-order bits with 0. This solves the character boundary problem.
New problem: memory is wasted. The more symbols the character set contains, the wider the codes must be, so the waste of fixed-length encoding becomes significant. We need a way to reduce this memory consumption.
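The waste can be made concrete with a small sketch. It assumes a fixed width of 4 bytes per character (the UTF-32 idea, which the text above does not name) and compares the result with Go's native string storage, which is UTF-8. `encodeFixed4` is a hypothetical helper written for this illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeFixed4 stores every code point in exactly 4 bytes (big-endian),
// padding the high-order bits with zeros.
// (Hypothetical helper; the 4-byte width is an assumption of this sketch.)
func encodeFixed4(s string) []byte {
	out := make([]byte, 0, 4*len(s))
	for _, r := range s { // range over a string yields code points
		var buf [4]byte
		binary.BigEndian.PutUint32(buf[:], uint32(r))
		out = append(out, buf[:]...)
	}
	return out
}

func main() {
	s := "eggo世界" // four ASCII letters followed by two Chinese characters
	fixed := encodeFixed4(s)
	fmt.Println(len(fixed))        // 24: 6 characters * 4 bytes
	fmt.Println(len(s))            // 10: Go stores the string as UTF-8
	fmt.Printf("% x\n", fixed[:4]) // 00 00 00 65: three padding bytes for 'e'
}
```

Every ASCII letter carries three zero bytes of padding here, which is the overhead that variable-length encoding avoids.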

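5.2 Variable-length encoding
Since fixed-length encoding is too wasteful, we use variable-length encoding: small numbers use fewer bytes, and large numbers use more bytes.
The scheme is as follows:
[0, 127]: one byte, whose leading flag bit is 0
[128, 2047]: two bytes; the first byte starts with the flag bits 110, and the second byte starts with the fixed flag bits 10
[2048, 65535]: three bytes; the first byte starts with the flag bits 1110, and the other two bytes each start with the fixed flag bits 10
01100101: the leading bit is 0, so this is a single-byte character. Removing the flag bit leaves 1100101 (decimal 101), which corresponds to e
11100100 10111000 10010110: the first byte starts with 1110, so this is a three-byte character. Removing the flag bits of all three bytes and joining the remaining bits gives 0100 1110 0001 0110 (U+4E16), which is 世, the first character of 世界 ("world")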
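The three ranges above can be sketched as a hand-written encoder and decoder. This is a simplified illustration, not a replacement for the standard library: the names `encode` and `decode` are hypothetical, numbers above 65535 are out of scope, and the decoder does not validate the continuation bytes.

```go
package main

import "fmt"

// encode applies the three range rules to one character number.
// Numbers above 65535 are outside this sketch and return nil.
func encode(cp uint32) []byte {
	switch {
	case cp <= 127:
		return []byte{byte(cp)} // 0xxxxxxx
	case cp <= 2047:
		return []byte{
			0xC0 | byte(cp>>6),   // 110xxxxx
			0x80 | byte(cp&0x3F), // 10xxxxxx
		}
	case cp <= 65535:
		return []byte{
			0xE0 | byte(cp>>12),       // 1110xxxx
			0x80 | byte((cp>>6)&0x3F), // 10xxxxxx
			0x80 | byte(cp&0x3F),      // 10xxxxxx
		}
	}
	return nil
}

// decode reads one character from the start of b by inspecting the flag
// bits of the first byte. It returns the number and the bytes consumed,
// or (0, 0) if the first byte matches none of the three patterns.
func decode(b []byte) (uint32, int) {
	switch {
	case len(b) >= 1 && b[0]&0x80 == 0x00:
		return uint32(b[0]), 1
	case len(b) >= 2 && b[0]&0xE0 == 0xC0:
		return uint32(b[0]&0x1F)<<6 | uint32(b[1]&0x3F), 2
	case len(b) >= 3 && b[0]&0xF0 == 0xE0:
		return uint32(b[0]&0x0F)<<12 | uint32(b[1]&0x3F)<<6 | uint32(b[2]&0x3F), 3
	}
	return 0, 0
}

func main() {
	fmt.Printf("%08b\n", encode('e')) // [01100101]
	fmt.Printf("%08b\n", encode('世')) // [11100100 10111000 10010110]

	cp, n := decode([]byte{0xE4, 0xB8, 0x96})
	fmt.Printf("%U %d\n", cp, n) // U+4E16 3
}
```

Because the first byte announces the length, a decoder never has to guess where a character ends, which is exactly the boundary problem that fixed-length encoding solved by padding.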

6. UTF-8 in detail
UTF-8 is a variable-length encoding that represents each character in 1 to 4 bytes.
The encoding rules are as follows:
1. For a single-byte character, the first bit is 0 and the remaining 7 bits hold the Unicode code point
2. For an n-byte character (n > 1), the first n bits of the first byte are 1 and the next bit is 0; each of the remaining bytes starts with 10