C language: the relationship between char, wchar_t, char16_t, char32_t and character sets

2022-07-06 21:34:00 landian004

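1. The standard does not fix char at 8 bits, but it does fix it at exactly 1 byte in C/C++! sizeof(char) is 1 by definition, and the number of bits in that byte is CHAR_BIT, which must be at least 8.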

In practice it is usually 8 bits. An 8-bit char is either signed or unsigned: unsigned covers 0 to 255, and signed covers -128 to 127, of which only 0 to 127 are non-negative. Those non-negative values line up with the 7-bit ASCII character set only, whereas ISO-8859-1, Windows-1252 and EBCDIC use all 256 values. Whether plain char is signed is implementation-defined; with the common x86 desktop compilers it is signed. This can be verified on Windows, macOS and Linux (how to verify?).

2. wchar_t is 16 or 32 bits wide (32 bits on Linux, 16 bits on Windows), so it is not portable.
wchar_t is also tied to the setlocale function: setlocale must be called before doing wide-character I/O. Exactly how the character set of wchar_t relates to setlocale is unclear to me. Second, does setlocale have any connection with char16_t and char32_t? Overall, because wchar_t is not portable, it is better to use it less and to use the fixed-width types char16_t and char32_t directly.

3. char16_t is 16 bits wide on mainstream platforms (formally, at least 16 bits) and is meant to hold UTF-16. UCS-2 is a fixed 16-bit encoding, whereas UTF-16 is variable-length: a character takes either 16 or 32 bits. A char16_t can store any UTF-16 code unit, but not every code point (that is, not every character). In UTF-16, a character that does not fit in 16 bits is represented by 2 code units, called a surrogate pair, and therefore by 2 char16_t values.

4. char32_t is 32 bits wide and is meant to hold UTF-32, so a single char32_t can store any code point. The drawback is that it wastes space!

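5. In practice, however, char16_t and char32_t are awkward to use, because neither the language nor the standard library provides printf-style output functions for these 2 new C11 types!! The only standard support is the conversion functions in <uchar.h>, such as c16rtomb and c32rtomb. So for direct output only wchar_t can be used!!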

const char *str = "中文";         // discouraged
const wchar_t *str2 = L"中文";    // use this
const char16_t *str3 = u"中文";   // needs <uchar.h>; there is no wprintf-style output function for it, so it cannot be printed directly

6. char *str = "中文"; printf("%s", str); also prints correctly, but measuring the length with strlen() gives the wrong result, because strlen() counts bytes rather than characters. For that reason char *, const char * or char str[] should not be used to represent Chinese strings. Question: why can char *str and printf (rather than wprintf) still represent and print a Chinese string correctly?

7. Representing and printing text with char and wchar_t only matters for console programs. When writing a GUI program, the GUI library (SDL, for example?) provides its own functions for displaying Chinese text in the interface. A console program can represent and print Chinese correctly even without wchar_t, wprintf and setlocale(), but the proper way is still to use wchar_t, setlocale() and wprintf.

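8. setlocale(LC_ALL, "zh-CN"); // "zh-CN", "zh-CN.UTF-8" or "": any of these 3 forms works. For portability the "zh-CN" form is recommended (in actual testing "zh-CN.UTF-8" also worked, and the name was case-insensitive). Other spellings of "zh-CN" exist as well. Note that the C standard only guarantees the names "C" and ""; every other locale name is implementation-defined, and POSIX systems usually spell this one "zh_CN.UTF-8". How setlocale works internally is still unclear to me.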

The above is the summary of review stage 1.

Copyright notice
This article was written by [landian004]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/187/202207061307418787.html