当前位置:网站首页>String and underlying character types of go data type
String and underlying character types of go data type
2022-06-30 07:52:00 【weixin_ fifty-nine million two hundred and eighty-four thousand】
character string
Basic use
stay Go In language , String is a basic type , The default is through UTF-8 Encoded character sequence , When the character is ASCII Code time takes up 1 Bytes , Use other characters as needed 2-4 Bytes , For example, Chinese coding usually needs 3 Bytes .
Declaration and initialization
The declaration and initialization of strings are very simple , Examples are as follows :
var str string // Declare string variables
str = "Hello World" // Variable initialization
str2 := " Hello , Academician " // It can also be declared and initialized at the same time Format output
You can also use Go Language built in len() Function to get the length of the specified string , And by fmt Provided by the package Printf Format string output :
fmt.Printf("The length of \"%s\" is %d \n", str, len(str))
fmt.Printf("The first character of \"%s\" is %c.\n", str, ch)Escape character
Go Language strings do not support single quotes , String literals can only be defined in double quotes , If you want to escape a specific character , Can pass \ Realization , Just as we escaped double quotation marks and line breaks in the string above , Common characters that need to be escaped are as follows :
\n: A newline\r: A carriage return\t:tab key\uor \U :Unicode character\\: The backslash itself
therefore , The output result of the above print code is :
The length of "Hello world" is 11
The first character of "Hello world" is H. besides , You can include... In a string as follows ":
label := `Search results for "Golang":`Multiline string
For multiline strings , It can also be done through ` structure :
results := `Search results for "Golang":
- Go
- Golang
Golang Programming
`
fmt.Printf("%s", results)The results are as follows :
Search results for "Golang":
- Go
- Golang
- Golang Programming Of course , Use + Connectors are also possible :
results := "Search results for \"Golang\":\n" +
"- Go\n" +
"- Golang\n" +
"- Golang Programming\n"
fmt.Printf("%s", results)The results are the same , But you have to input many more characters , It is not as elegant as the previous one .
Immutable value type
Although the characters in the string can be accessed through array subscript :
ch := str[0] // Take the first character of the string But unlike arrays , stay Go In language , A string is an immutable value type , Once initialized , Its contents cannot be modified , Take the following example for example :
str := "Hello world"
str[0] = 'X' // Compile error The compiler will report an error similar to the following :
cannot assign to str[0]Character encoding
Go The default string in the language is UTF-8 Coded Unicode Character sequence , So it can include non ANSI character , such as 「Hello, Academician 」 Can appear in Go In the code .
But it should be noted that , If your Go Code needs to contain non ANSI character , Please note that the encoding format must be selected when saving the source file UTF-8. Especially in Windows General Editors in the lower level are saved as local codes by default , For example, China may be GBK Code instead of UTF-8, If you don't notice this, there will be some unexpected situations when compiling and running .
The encoding and conversion of strings is to process text documents ( such as TXT、XML、HTML etc. ) Is a very common requirement , however Go By default, the language only supports UTF-8 and Unicode code , For other codes ,Go The language standard library does not have built-in transcoding support .
String manipulation
String connection
Go Built in provides rich string functions , Common operations include connecting 、 Get the length and the specified characters , Getting the length and specifying the characters has been described earlier , The string connection only needs to be through + The connector is OK :
str = str + ", Application development "
str += ", Application development " // The above statement can also be abbreviated as , The effect is exactly the same in addition , Another thing to note is that if the string length is long , Need a new line , be + The connector must appear at the end of the previous line , Otherwise, an error will be reported :
str = str +
", Application development "String slice
stay Go In language , The function of obtaining substrings can be realized through string slicing :
str := "hello, world"
str1 := str[:5] // Get index 5( Not included ) Previous substring
str2 := str[7:] // Get index 7( contain ) After the string
str3 := str[0:5] // Get from index 0( contain ) To the index 5( Not included ) Between the strings
fmt.Println("str1:", str1)
fmt.Println("str2:", str2)
fmt.Println("str3:", str3)Go Slice interval can be understood by comparing the concept of interval in mathematics , It's a Left closed right away The range of , For example str[0:5] The interval corresponding to the string element is [0,5),str[:5] The corresponding interval is [0,5)( Array index from 0 Start ),str[7:] The corresponding interval is [7:len(str)]( This is a closed interval , The exception is , Because the end of the interval is not specified ).
therefore , The above code is printed as follows :
str1: hello
str2: world
str3: hello in summary , String slicing through : The start and end point indexes of the connection slice the string , The number before the colon represents the starting point , Null means from 0 Start , The next number represents the end point , Null means to the end of the string , Not the length of the substring . therefore str[:] Will print out the complete string .
Besides Go String also supports string comparison 、 Contains the specified character / Substring 、 Gets the specified substring index position 、 String substitution 、 toggle case 、trim Wait for the operation
String traversal
Go The language supports two ways to traverse strings .
A is Byte array The way to traverse :
str := "Hello, The world "
n := len(str)
for i := 0; i < n; i++ {
ch := str[i] // Take the characters in the string according to the subscript ,ch The type is byte
fmt.Println(i, ch)
}The output of this example is :
0 72
1 101
2 108
3 108
4 111
5 44
6 32
7 228
8 184
9 150
10 231
11 149
12 140It can be seen that , The length of this string is 13, Although intuitively , This string should only have 9 Characters . This is because each Chinese character is in UTF-8 Middle occupancy 3 Bytes , instead of 1 Bytes .
The other is to Unicode character Traverse :
str := "Hello, The world "
for i, ch := range str {
fmt.Println(i, ch) // ch The type of rune
}The output is :
0 72
1 101
2 108
3 108
4 111
5 44
6 32
7 19990
10 30028 This is the time , What's printed is 9 A character. , Because in order to Unicode When traversing in character mode , The type of each character is rune, instead of byte.
You may be a little confused when you see here , Will be curious Go How does the bottom layer store strings , Why do different traversal methods get different results ? Now let's give you a simple break .
Underlying character types
Go Language provides separate type support for individual characters in strings , stay Go Two character types are supported in the language :
- One is
byte, representative UTF-8 The value of a single byte in the encoding ( So is ituint8Alias for type , The two are equivalent , Because it just occupies 1 Bytes of memory space ); - The other is
rune, Represents a single Unicode character ( So is itint32Alias for type , Because it just occupies 4 Bytes of memory space . AboutruneRelated operations , Can refer to Go Standard library unicode package ).
UTF-8 and Unicode The difference between
Speaking of this , We need to distinguish between UTF-8 and Unicode The difference between .
Unicode It's a character set , It includes all characters of all languages in the world , Similar terms include ASCII Character set ( Contains only 256 Characters )、ISO 8859-1 Character set, etc. ( Contains all western Latin letters ), The generalized Unicode Both contain the character set , It also contains coding rules , such as UTF-8、UTF-16、UTF8MB4、GBK etc. .
therefore UTF-8 yes Unicode One of the implementation methods of character set , It will be Unicode Characters are encoded in some way . In the specific implementation ,UTF-8 Is a variable length coding rule , from 1~4 Different bytes , For example, the English characters are 1 Bytes , The Chinese characters are 3 Bytes . adopt UTF-8 Coded Unicode Characters with maximum length 4 Bytes as a fixed memory space occupied by a single character , stay Go In language, you can use unicode/utf8 Package progress UTF-8 and Unicode Conversion between .
So if you go from Unicode From the perspective of character set , Each character of a string is an independent unit of a character , But if from UTF-8 From a coding perspective , A character may be encoded by more than one byte .
We go through len The function gets the byte length of the string , According to this, when traversing a string through a character array , In order to UTF-8 From the perspective of coding ; And when we pass range Keyword when traversing a string , Again from Unicode From the angle of character set , So we get different results .
For the sake of simplifying the language ,Go Most languages API All assume that the string is UTF-8 code .
take Unicode Encoding into printable characters
If you want to Unicode The character encoding is converted to the corresponding character , have access to string Function to convert :
str := "Hello, The world "
for i, ch := range str {
fmt.Println(i, string(ch))
}The corresponding print results are as follows :
0 H
1 e
2 l
3 l
4 o
5 ,
6
7 the
10 world UTF-8 Coding cannot be transformed like this , English characters are OK , Because an English character is a byte , Chinese characters are garbled , Because a Chinese character encoding needs three bytes , Converting a single byte will cause garbled code .
边栏推荐
- Go 数据类型篇之字符串及底层字符类型
- STM32 infrared communication
- Multi whale capital: report on China's education intelligent hardware industry in 2022
- Final review -php learning notes 4-php custom functions
- Cadence innovus physical implementation series (I) Lab 1 preliminary innovus
- 【花雕体验】13 搭建ESP32C3之PlatformIO IDE开发环境
- Palindrome substring, palindrome subsequence
- HelloWorld
- 冰冰学习笔记:快速排序
- Deep learning -- using word embedding and word embedding features
猜你喜欢
![December 4, 2021 [metagenome] - sorting out the progress of metagenome process construction](/img/03/eb6e6092922cf42c2c9866e7bb504d.jpg)
December 4, 2021 [metagenome] - sorting out the progress of metagenome process construction

CRM能为企业带来哪些管理提升

Deep learning -- Realization of convolution by sliding window

Analysys analysis: online audio content consumption market analysis 2022

Final review -php learning notes 4-php custom functions

2022 retail industry strategy: three strategies for consumer goods gold digging (in depth)

Palindrome substring, palindrome subsequence

Deep learning -- language model and sequence generation

mysql无法连接内网的数据库

期末复习-PHP学习笔记2-PHP语言基础
随机推荐
December 4, 2021 [metagenome] - sorting out the progress of metagenome process construction
2021 private equity fund market report (62 pages)
Construction of module 5 of actual combat Battalion
深度学习——特征点检测和目标检测
Tue Jun 28 2022 15:30:29 GMT+0800 (中国标准时间) 日期格式化
The counting tool of combinatorial mathematics -- generating function
Deep learning -- recurrent neural network
Arm debug interface (adiv5) analysis (I) introduction and implementation [continuous update]
Go 数据类型篇之字符串及底层字符类型
342 maps covering exquisite knowledge, one of which is classic and pasted on the wall
架构实战营模块 5 作业
Firewall firewalld
Cross compile opencv3.4 download cross compile tool chain and compile (3)
Deep learning -- sequence model and mathematical symbols
December 13, 2021 [reading notes] | understanding of chain specific database building
1162 Postfix Expression
鲸探NFT数字臧品系统开发技术分享
Lexicographic order -- full arrangement in bell sound
At the age of 25, I started to work in the Tiankeng industry with buckets. After going through a lot of hardships to become a programmer, my spring finally came
Digital white paper on total cost management in chain operation industry