The trouble caused by <meta charset="UTF-8">: an analysis of common character encodings
2022-07-29 04:03:00 【Boiled water】
What does <meta charset="UTF-8"> mean?
I have been looking for an internship recently and time is limited. From now on I want to write down the underlying principles clearly.
First, the meaning of this line of code:
<meta charset="UTF-8">: meta is a child tag of the head tag.
It declares the character encoding the browser should use to decode the web page file when displaying it.
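To make the role of the declaration concrete, here is a minimal Python sketch. It is not how any particular browser is implemented: the helper `sniff_charset`, the regular expression, and the 1024-byte window are assumptions chosen for illustration. The sketch works because the tag itself consists only of ASCII characters, so it can be found before the encoding of the rest of the page is known.

```python
import re

# Hypothetical helper (not from the original post): find a charset declaration
# in the first 1024 bytes of a page, before the page has been decoded.
META_CHARSET = re.compile(
    rb'<meta[^>]*?charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE
)

def sniff_charset(raw: bytes, default: str = "utf-8") -> str:
    """Return the declared charset in lowercase, or `default` if none is found."""
    match = META_CHARSET.search(raw[:1024])
    return match.group(1).decode("ascii").lower() if match else default

# A page saved as GBK that declares its encoding correctly.
page = '<html><head><meta charset="GBK"><title>中文</title></head></html>'.encode("gbk")

charset = sniff_charset(page)                       # the declared encoding
text = page.decode(charset)                         # decoded as declared
wrong = page.decode("utf-8", errors="replace")      # decoded with the wrong encoding
```

Decoding with the declared encoding recovers the title, while decoding the same bytes as UTF-8 yields replacement characters.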
List of common character encodings
Encoding | Description | Remarks |
ASCII | 1. ASCII encodes each letter or symbol in 1 byte (8 bits) whose highest bit is 0, so ASCII can encode only 128 letters and symbols. Some encodings also assign the 128 values whose highest bit is 1, so that 1 byte can represent 256 values, but that is extended ASCII, not standard ASCII. Standard ASCII normally means only the first 128 values. 2. Almost every encoding in the world is compatible with ASCII (UTF-16 and UTF-32 are the exceptions), so if a text document consists only of ASCII letters and symbols, it will not be garbled whichever of these encodings is used to display it. 3. Half-width -> one byte (English) -> ASCII -> recognized by the compiler. | |
Unicode | 1. An international standard character set that defines a unique code for every character of the world's languages, to support cross-language and cross-platform text conversion. 2. UTF-8 marks the boundaries between characters with the run of consecutive 1 bits at the top of the first byte, which tells how many bytes encode the character. A byte starting with 0 is a single-byte character that coincides with its ASCII code, which is why UTF-8 is ASCII-compatible. | |
UTF-8 (UTF-16/UTF-32) | UTF-8 is one of the most widely used implementations of Unicode. | Maps the abstract code points of the Unicode character set to sequences of 8-bit (16-bit/32-bit) integers (code units) for data storage or transmission. |
GB2312 | 1. GB stands for GuoBiao (national standard) and GBK for GuoBiao Kuozhan (national standard extension). GB18030 is compatible with GBK, and GBK is compatible with GB2312, so the three encodings are closely related. 2. The earliest national standard for encoding simplified Chinese characters. It uses double-byte encoding and contains 7445 graphic characters, of which 6763 are Chinese characters. 3. GB2312: full-width -> two bytes -> not recognized by the compiler. | |
BIG5 | The Traditional Chinese standard character set used in Taiwan. It uses double-byte encoding and contains 13053 Chinese characters. | |
GBK | An extension of GB2312 that also encodes Chinese characters in two bytes. The GBK character set contains 21003 Chinese characters, including all the CJK (Chinese, Japanese, Korean) Chinese characters in the national standard GB13000-1 and all the Chinese characters in BIG5. | |
GB18030 | An extension of GBK that covers Chinese, Japanese, Korean, and the scripts of China's ethnic minorities, with 27484 Chinese characters. The GB18030 character set encodes characters in one of three ways: single byte, double byte, or four bytes. It is compatible with the GBK and GB2312 character sets. | |
Table 1: Common character encodings
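The leading-byte rule described in the Unicode row can be checked with a short Python sketch. The function name `utf8_sequence_length` and the sample characters are assumptions picked for illustration. The sketch only inspects the first byte and does not validate the continuation bytes.

```python
def utf8_sequence_length(lead: int) -> int:
    """Return the length of a UTF-8 sequence, judged from its leading byte."""
    if lead >> 7 == 0b0:        # 0xxxxxxx: single byte, identical to ASCII
        return 1
    if lead >> 5 == 0b110:      # 110xxxxx: two-byte sequence
        return 2
    if lead >> 4 == 0b1110:     # 1110xxxx: three-byte sequence
        return 3
    if lead >> 3 == 0b11110:    # 11110xxx: four-byte sequence
        return 4
    raise ValueError(f"0x{lead:02x} is not a valid UTF-8 leading byte")

samples = ["A", "é", "中", "😀"]
lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch in samples:
    first_byte = ch.encode("utf-8")[0]
    # The count of leading 1 bits matches the real encoded length.
    print(f"{ch!a}: first byte {first_byte:08b}, "
          f"{utf8_sequence_length(first_byte)} byte(s)")

# Pure ASCII text is byte-for-byte identical in ASCII, UTF-8 and GBK ...
ascii_compatible = (
    "meta".encode("ascii") == "meta".encode("utf-8") == "meta".encode("gbk")
)
# ... but not in UTF-16, whose code units are 16 bits wide.
utf16_bytes = "A".encode("utf-16-le")
```

The last two lines illustrate the exception noted in the ASCII row: the same letter occupies two bytes in UTF-16, so a UTF-16 file is not readable as ASCII.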
Compatibility

Figure 1: Compatibility of common character encodings
The figure shows that every encoding listed is compatible with ASCII, while the two most common ones, UTF-8 and GBK, have nothing in common outside the ASCII range. This is the most frequent cause of garbled text in everyday work: read GBK-encoded text as UTF-8 and you may see all kinds of garbage. Within the GB family, GB18030 is compatible with GBK, and GBK is compatible with GB2312.
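Both claims can be reproduced with Python's built-in codecs. This is a small sketch, and the sample strings and the helper `can_encode` are assumptions chosen for illustration.

```python
text = "编码"                     # "encoding"; both characters exist in GB2312
gbk_bytes = text.encode("gbk")

# Reading GBK bytes as UTF-8: the bytes are not valid UTF-8 sequences,
# so each undecodable byte becomes U+FFFD, the replacement character.
garbled = gbk_bytes.decode("utf-8", errors="replace")

# Reading them with the encoding they were written in restores the text.
restored = gbk_bytes.decode("gbk")

# Within the GB family, a GB2312 character has the same bytes in all three.
same_in_gb_family = text.encode("gb2312") == gbk_bytes == text.encode("gb18030")

def can_encode(ch: str, encoding: str) -> bool:
    """Hypothetical helper: report whether `encoding` has a code for `ch`."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# A traditional character: absent from GB2312, present in GBK and GB18030.
traditional = "龍"
coverage = {enc: can_encode(traditional, enc) for enc in ("gb2312", "gbk", "gb18030")}
```

Compatibility runs in one direction only: text written in the older encoding can be read with the newer one, but a character added by GBK cannot be written in GB2312.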