UTF encoding and character set in golang
2022-07-04 21:04:00 【Nanyidao street】
I. UTF Encoding and Character Sets in Golang
1. Character set
A bit is either 1 or 0, and no bit on its own is the letter A. What we can do is define a mapping between characters such as A-Z and numbers, for example letting 0100 0001 represent A. Collecting these mappings gives a table that pairs each character with a number, and that table is called a character set.
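As a small illustration of this mapping, the sketch below uses Go's rune type, which is just an integer holding a character's number. The helper names charToNumber and numberToChar are made up for this example.

```go
package main

import "fmt"

// charToNumber returns the number that the character set assigns to c.
// In Go a rune is an integer type, so this is a plain conversion.
// (Hypothetical helper, for illustration only.)
func charToNumber(c rune) int {
	return int(c)
}

// numberToChar is the reverse lookup: from a number back to its character.
// (Hypothetical helper, for illustration only.)
func numberToChar(n int) string {
	return string(rune(n))
}

func main() {
	// Show the character, its number, and the number as 8 binary digits.
	fmt.Printf("%c -> %d -> %08b\n", 'A', charToNumber('A'), charToNumber('A'))

	// 0100 0001 from the text above maps back to the letter A.
	fmt.Println(numberToChar(0b01000001))
}
```

The first line printed is `A -> 65 -> 01000001`, matching the bit pattern used in the text.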


2. ASCII Character set
ASCII defines only 128 characters; the extended character sets built on it have 256.

3. GB2312 Character set
ASCII does not support Chinese characters, so the GB2312 character set was created.

4. Unicode Character set
Many characters are still missing from the character sets above, so a universal character set is needed. Creating one is exactly what the Unicode Consortium does.
5. Fixed-length encoding and variable-length encoding
5.1 Fixed-length encoding
Suppose we want to represent "eggo世界". We can look up the number of each character directly in the Unicode character set, but once we have the numbers, how to divide the resulting bit sequence back into characters is another problem. If the boundaries are chosen arbitrarily, the same bits decode to something entirely different from what was written.


Solution: no matter how many bits each number needs, give every character the same width, that of the longest one, and pad the high bits with 0 when a number is shorter. This solves the character boundary problem.
New problem: memory is wasted, and the more symbols the character set contains, the wider each code has to be, so fixed-length encoding wastes a significant amount of space. We have to find a way to reduce the memory consumption.
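To make the waste concrete, here is a minimal sketch that stores every character in exactly 4 bytes and compares the result with the length of Go's own string representation, which is UTF-8. The function encodeFixed and the choice of 4 big-endian bytes per character are assumptions made for this example; the sample string is "eggo" followed by the two Chinese characters for "world".

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeFixed stores every character of s as exactly 4 bytes (big-endian),
// so small numbers are padded with leading zero bytes.
// (Hypothetical helper; the 4-byte width is an assumption of this sketch.)
func encodeFixed(s string) []byte {
	var out []byte
	for _, r := range s { // range over a string yields one rune per character
		var buf [4]byte
		binary.BigEndian.PutUint32(buf[:], uint32(r))
		out = append(out, buf[:]...)
	}
	return out
}

func main() {
	s := "eggo世界"
	fixed := encodeFixed(s)

	fmt.Println(len(fixed)) // 6 characters * 4 bytes = 24
	fmt.Println(len(s))     // 10 bytes in Go's UTF-8 representation

	// The letter e needs one byte of information but occupies four.
	fmt.Printf("% x\n", fixed[:4]) // 00 00 00 65
}
```

Boundaries are trivial to find here (every 4 bytes), but more than half of the bytes are padding for this string.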

5.2 Variable-length encoding
Since fixed-length encoding is too wasteful, we use variable-length encoding: small numbers take fewer bytes and large numbers take more bytes.
The scheme is as follows:
[0, 127]: one byte; the highest bit is the flag bit 0
[128, 2047]: two bytes; the first byte starts with the flag bits 110, and the second byte starts with the fixed flag bits 10
[2048, 65535]: three bytes; the first byte starts with the flag bits 1110, and the other two bytes each start with the fixed flag bits 10
01100101: the highest bit is 0, so this is a one-byte character; remove the flag bit and the remaining bits are the number of "e"
11100100 10111000 10010110: the first byte starts with 1110, so the character takes three bytes; remove the flag bits from each of the three bytes and combine the remaining bits to get "世" (U+4E16), the first character of "世界"
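The ranges and flag bits above can be sketched in Go as follows. This is a simplified illustration rather than a full encoder: encode and decode are names invented for this example, they only cover numbers up to 65535, and decode assumes its input is well formed.

```go
package main

import "fmt"

// encode turns a number in [0, 65535] into 1 to 3 bytes using the
// flag-bit scheme described above. (Hypothetical helper; it does no
// validation beyond choosing the range.)
func encode(n uint16) []byte {
	switch {
	case n <= 127:
		return []byte{byte(n)} // 0xxxxxxx
	case n <= 2047:
		return []byte{
			0b11000000 | byte(n>>6),       // 110xxxxx: top 5 bits
			0b10000000 | byte(n&0b111111), // 10xxxxxx: low 6 bits
		}
	default:
		return []byte{
			0b11100000 | byte(n>>12),           // 1110xxxx: top 4 bits
			0b10000000 | byte((n>>6)&0b111111), // 10xxxxxx: middle 6 bits
			0b10000000 | byte(n&0b111111),      // 10xxxxxx: low 6 bits
		}
	}
}

// decode reads one character from the front of b and reports how many
// bytes it used, decided by the flag bits of the first byte.
// (Hypothetical helper; assumes b is long enough and well formed.)
func decode(b []byte) (n uint16, size int) {
	switch {
	case b[0]&0b10000000 == 0: // 0xxxxxxx
		return uint16(b[0]), 1
	case b[0]&0b11100000 == 0b11000000: // 110xxxxx
		return uint16(b[0]&0b00011111)<<6 | uint16(b[1]&0b00111111), 2
	default: // assumed to be 1110xxxx
		return uint16(b[0]&0b00001111)<<12 |
			uint16(b[1]&0b00111111)<<6 |
			uint16(b[2]&0b00111111), 3
	}
}

func main() {
	fmt.Printf("%08b\n", encode('e'))
	fmt.Printf("%08b\n", encode('世'))

	// Decode the three bytes from the text back into a character.
	n, size := decode([]byte{0b11100100, 0b10111000, 0b10010110})
	fmt.Printf("%c U+%04X %d\n", n, n, size)
}
```

Because the first byte announces the length, a decoder can walk a byte stream character by character without any padding, which is the boundary problem from 5.1 solved at much lower cost.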

6. UTF-8 in detail
UTF-8 is a variable-length encoding that represents each character in 1 to 4 bytes.
The encoding rules are as follows:
1. For a one-byte character, the first bit is 0 and the remaining 7 bits hold the Unicode code point
2. For an n-byte character (n > 1), the first n bits of the first byte are 1 and the (n+1)th bit is 0; every remaining byte starts with 10