当前位置:网站首页>String and underlying character types of go data type
String and underlying character types of go data type
2022-06-30 07:52:00 【weixin_ fifty-nine million two hundred and eighty-four thousand】
character string
Basic use
stay Go In language , String is a basic type , The default is through UTF-8 Encoded character sequence , When the character is ASCII Code time takes up 1 Bytes , Use other characters as needed 2-4 Bytes , For example, Chinese coding usually needs 3 Bytes .
Declaration and initialization
The declaration and initialization of strings are very simple , Examples are as follows :
var str string // Declare string variables
str = "Hello World" // Variable initialization
str2 := " Hello , Academician " // It can also be declared and initialized at the same time Format output
You can also use Go Language built in len() Function to get the length of the specified string , And by fmt Provided by the package Printf Format string output :
fmt.Printf("The length of \"%s\" is %d \n", str, len(str))
fmt.Printf("The first character of \"%s\" is %c.\n", str, ch)Escape character
Go Language strings do not support single quotes , String literals can only be defined in double quotes , If you want to escape a specific character , Can pass \ Realization , Just as we escaped double quotation marks and line breaks in the string above , Common characters that need to be escaped are as follows :
\n: A newline\r: A carriage return\t:tab key\uor \U :Unicode character\\: The backslash itself
therefore , The output result of the above print code is :
The length of "Hello world" is 11
The first character of "Hello world" is H. besides , You can include... In a string as follows ":
label := `Search results for "Golang":`Multiline string
For multiline strings , It can also be done through ` structure :
results := `Search results for "Golang":
- Go
- Golang
Golang Programming
`
fmt.Printf("%s", results)The results are as follows :
Search results for "Golang":
- Go
- Golang
- Golang Programming Of course , Use + Connectors are also possible :
results := "Search results for \"Golang\":\n" +
"- Go\n" +
"- Golang\n" +
"- Golang Programming\n"
fmt.Printf("%s", results)The results are the same , But you have to input many more characters , It is not as elegant as the previous one .
Immutable value type
Although the characters in the string can be accessed through array subscript :
ch := str[0] // Take the first character of the string But unlike arrays , stay Go In language , A string is an immutable value type , Once initialized , Its contents cannot be modified , Take the following example for example :
str := "Hello world"
str[0] = 'X' // Compile error The compiler will report an error similar to the following :
cannot assign to str[0]Character encoding
Go The default string in the language is UTF-8 Coded Unicode Character sequence , So it can include non ANSI character , such as 「Hello, Academician 」 Can appear in Go In the code .
But it should be noted that , If your Go Code needs to contain non ANSI character , Please note that the encoding format must be selected when saving the source file UTF-8. Especially in Windows General Editors in the lower level are saved as local codes by default , For example, China may be GBK Code instead of UTF-8, If you don't notice this, there will be some unexpected situations when compiling and running .
The encoding and conversion of strings is to process text documents ( such as TXT、XML、HTML etc. ) Is a very common requirement , however Go By default, the language only supports UTF-8 and Unicode code , For other codes ,Go The language standard library does not have built-in transcoding support .
String manipulation
String connection
Go Built in provides rich string functions , Common operations include connecting 、 Get the length and the specified characters , Getting the length and specifying the characters has been described earlier , The string connection only needs to be through + The connector is OK :
str = str + ", Application development "
str += ", Application development " // The above statement can also be abbreviated as , The effect is exactly the same in addition , Another thing to note is that if the string length is long , Need a new line , be + The connector must appear at the end of the previous line , Otherwise, an error will be reported :
str = str +
", Application development "String slice
stay Go In language , The function of obtaining substrings can be realized through string slicing :
str := "hello, world"
str1 := str[:5] // Get index 5( Not included ) Previous substring
str2 := str[7:] // Get index 7( contain ) After the string
str3 := str[0:5] // Get from index 0( contain ) To the index 5( Not included ) Between the strings
fmt.Println("str1:", str1)
fmt.Println("str2:", str2)
fmt.Println("str3:", str3)Go Slice interval can be understood by comparing the concept of interval in mathematics , It's a Left closed right away The range of , For example str[0:5] The interval corresponding to the string element is [0,5),str[:5] The corresponding interval is [0,5)( Array index from 0 Start ),str[7:] The corresponding interval is [7:len(str)]( This is a closed interval , The exception is , Because the end of the interval is not specified ).
therefore , The above code is printed as follows :
str1: hello
str2: world
str3: hello in summary , String slicing through : The start and end point indexes of the connection slice the string , The number before the colon represents the starting point , Null means from 0 Start , The next number represents the end point , Null means to the end of the string , Not the length of the substring . therefore str[:] Will print out the complete string .
Besides Go String also supports string comparison 、 Contains the specified character / Substring 、 Gets the specified substring index position 、 String substitution 、 toggle case 、trim Wait for the operation
String traversal
Go The language supports two ways to traverse strings .
A is Byte array The way to traverse :
str := "Hello, The world "
n := len(str)
for i := 0; i < n; i++ {
ch := str[i] // Take the characters in the string according to the subscript ,ch The type is byte
fmt.Println(i, ch)
}The output of this example is :
0 72
1 101
2 108
3 108
4 111
5 44
6 32
7 228
8 184
9 150
10 231
11 149
12 140It can be seen that , The length of this string is 13, Although intuitively , This string should only have 9 Characters . This is because each Chinese character is in UTF-8 Middle occupancy 3 Bytes , instead of 1 Bytes .
The other is to Unicode character Traverse :
str := "Hello, The world "
for i, ch := range str {
fmt.Println(i, ch) // ch The type of rune
}The output is :
0 72
1 101
2 108
3 108
4 111
5 44
6 32
7 19990
10 30028 This is the time , What's printed is 9 A character. , Because in order to Unicode When traversing in character mode , The type of each character is rune, instead of byte.
You may be a little confused when you see here , Will be curious Go How does the bottom layer store strings , Why do different traversal methods get different results ? Now let's give you a simple break .
Underlying character types
Go Language provides separate type support for individual characters in strings , stay Go Two character types are supported in the language :
- One is
byte, representative UTF-8 The value of a single byte in the encoding ( So is ituint8Alias for type , The two are equivalent , Because it just occupies 1 Bytes of memory space ); - The other is
rune, Represents a single Unicode character ( So is itint32Alias for type , Because it just occupies 4 Bytes of memory space . AboutruneRelated operations , Can refer to Go Standard library unicode package ).
UTF-8 and Unicode The difference between
Speaking of this , We need to distinguish between UTF-8 and Unicode The difference between .
Unicode It's a character set , It includes all characters of all languages in the world , Similar terms include ASCII Character set ( Contains only 256 Characters )、ISO 8859-1 Character set, etc. ( Contains all western Latin letters ), The generalized Unicode Both contain the character set , It also contains coding rules , such as UTF-8、UTF-16、UTF8MB4、GBK etc. .
therefore UTF-8 yes Unicode One of the implementation methods of character set , It will be Unicode Characters are encoded in some way . In the specific implementation ,UTF-8 Is a variable length coding rule , from 1~4 Different bytes , For example, the English characters are 1 Bytes , The Chinese characters are 3 Bytes . adopt UTF-8 Coded Unicode Characters with maximum length 4 Bytes as a fixed memory space occupied by a single character , stay Go In language, you can use unicode/utf8 Package progress UTF-8 and Unicode Conversion between .
So if you go from Unicode From the perspective of character set , Each character of a string is an independent unit of a character , But if from UTF-8 From a coding perspective , A character may be encoded by more than one byte .
We go through len The function gets the byte length of the string , According to this, when traversing a string through a character array , In order to UTF-8 From the perspective of coding ; And when we pass range Keyword when traversing a string , Again from Unicode From the angle of character set , So we get different results .
For the sake of simplifying the language ,Go Most languages API All assume that the string is UTF-8 code .
take Unicode Encoding into printable characters
If you want to Unicode The character encoding is converted to the corresponding character , have access to string Function to convert :
str := "Hello, The world "
for i, ch := range str {
fmt.Println(i, string(ch))
}The corresponding print results are as follows :
0 H
1 e
2 l
3 l
4 o
5 ,
6
7 the
10 world UTF-8 Coding cannot be transformed like this , English characters are OK , Because an English character is a byte , Chinese characters are garbled , Because a Chinese character encoding needs three bytes , Converting a single byte will cause garbled code .
边栏推荐
- 6月底了,可以开始做准备了,不然这么赚钱的行业就没你的份了
- Final review -php learning notes 4-php custom functions
- Commands and permissions for directories and files
- F12抓包用于做postman接口测试的全过程解析
- 1162 Postfix Expression
- July 30, 2021 [wgs/gwas] - whole genome analysis process (Part I)
- 【笔记】Polygon mesh processing 学习笔记(10)
- Efga design open source framework fabulous series (I) establishment of development environment
- Deep learning -- feature point detection and target detection
- Firewall firewalld
猜你喜欢
![2021-10-29 [microbiology] a complete set of 16s/its analysis process based on qiime2 tool (Part I)](/img/9d/37c531b1b439770f69f715687685f5.jpg)
2021-10-29 [microbiology] a complete set of 16s/its analysis process based on qiime2 tool (Part I)

Why don't you know what to do after graduation from university?

Final review -php learning notes 3-php process control statement

【花雕体验】12 搭建ESP32C3之Arduino开发环境

AcrelEMS能效管理平台为高层小区用电安全保驾护航

Deep learning -- recurrent neural network

24C02
![November 21, 2021 [reading notes] - bioinformatics and functional genomics (Chapter 5 advanced database search)](/img/20/bbb6e740df96250016147c54bc6cc1.jpg)
November 21, 2021 [reading notes] - bioinformatics and functional genomics (Chapter 5 advanced database search)

深度学习——嵌入矩阵and学习词嵌入andWord2Vec

直击产业落地 | 飞桨重磅推出业界首个模型选型工具
随机推荐
Fishingprince Plays with Array
Deep learning -- Realization of convolution by sliding window
Network security and data in 2021: collection of new compliance review articles (215 pages)
Disk space, logical volume
min_ max_ Gray operator understanding
直击产业落地 | 飞桨重磅推出业界首个模型选型工具
Deep learning - goal orientation
Commands and permissions for directories and files
Directory of software
Final review -php learning notes 5-php array
Arm debug interface (adiv5) analysis (I) introduction and implementation [continuous update]
【笔记】Polygon mesh processing 学习笔记(10)
C. Fishingprince Plays With Array
Tue Jun 28 2022 15:30:29 gmt+0800 (China standard time) date formatting
Final review -php learning notes 1
想问问,炒股怎么选择证券公司?网上开户安全么?
深度学习——词汇表征
November 21, 2021 [reading notes] - bioinformatics and functional genomics (Chapter 5 advanced database search)
Wangbohua: development situation and challenges of photovoltaic industry
Deep learning -- recurrent neural network