Character strings and memory operation functions in C
2020-11-06 01:16:00 【South of the city Hua has opened】
1 Characters and their operation functions
1.1 Characters
The character type char is one of C's most important types. It behaves a little differently from the integer and floating-point types, so this section covers the basics of characters in C.
The characters discussed here are those in the American Standard Code for Information Interchange (hereinafter "ASCII") table. The table assigns a number to every character: 'a' has ASCII code 97, 'A' has 65, '1' has 49, and so on. Because a computer can only store binary data, what is actually stored in memory is the binary form of the character's ASCII code. We can therefore treat char as a 1-byte integer type (whether plain char is signed or unsigned is implementation-defined).
Figure 1.1 ASCII table
Some characters and commands cannot be written directly. A carriage return is one example; another is the single quotation mark, which C uses to delimit a character constant even though it is a character itself. These are written as escape sequences: a backslash followed by a specified character, where the character after the backslash no longer has its original meaning. For example, 'n' is the letter n, but with a backslash in front, '\n', the compiler treats the backslash and the n as one unit that means line feed.
Common escape characters and their meanings (source: Baidu Encyclopedia, "Escape character"):
Escape character | Meaning | ASCII code value (decimal)
\a | Bell (BEL) | 007
\b | Backspace (BS): moves the current position back one column | 008
\f | Form feed (FF): moves the current position to the start of the next page | 012
\n | Line feed (LF): moves the current position to the start of the next line | 010
\r | Carriage return (CR): moves the current position to the start of the current line | 013
\t | Horizontal tab (HT): jumps to the next tab stop | 009
\v | Vertical tab (VT) | 011
\\ | A backslash character '\\' | 092
\' | A single quotation mark (apostrophe) | 039
\" | A double quotation mark | 034
\? | A question mark | 063
\0 | Null character (NUL) | 000
\ddd | Any character, given by 1 to 3 octal digits | Up to three octal digits
\xhh | Any character, given by hexadecimal digits | Hexadecimal digits
Chinese characters occupy a different number of bytes depending on the encoding, so this article does not discuss them.
1.2 Character manipulation functions
C has two main kinds of character functions, both declared in <ctype.h>. Character classification functions are often used to check whether user input is valid; character conversion functions convert English letters to uppercase or lowercase.
1.2.1 Character classification functions
Common character classification functions are listed in the table below:
Function | Returns true (non-zero) if its argument meets the condition below, otherwise false (zero)
iscntrl | Any control character
isspace | Whitespace character: space ' ', form feed '\f', line feed '\n', carriage return '\r', horizontal tab '\t', vertical tab '\v'
isdigit | Decimal digit 0~9
isxdigit | Hexadecimal digit: all decimal digits, lowercase letters a~f, uppercase letters A~F
islower | Lowercase letter a~z
isupper | Uppercase letter A~Z
isalpha | Letter a~z or A~Z
isalnum | Letter or digit: 0~9, a~z, A~Z
ispunct | Punctuation: any printable character that is not a letter, a digit, or a space
isgraph | Any graphic character (any printable character except the space)
isprint | Any printable character: the graphic characters plus the space character
Note: in ASCII, codes 0~31 and 127 (33 in total) are control characters or communication-specific characters. Control characters include LF (line feed), CR (carriage return), FF (form feed), DEL (delete), BS (backspace), and BEL (bell); communication-specific characters include SOH (start of heading), EOT (end of transmission), and ACK (acknowledge).
1.2.2 Character conversion functions
Convert to lowercase:
    int tolower(int c);
Convert to uppercase:
    int toupper(int c);
For example:
    char s = 'a';
    printf("%c\n", toupper(s));  /* prints the uppercase letter */
    printf("%c\n", s);           /* s itself is still 'a' */
Figure 1.2
As the program output shows, converting a character does not change the character itself: the conversion function only returns the ASCII code of the corresponding uppercase (or lowercase) letter. To change the case of a whole string, traverse the string and apply the conversion function to each character.
2 Strings and their operation functions
2.1 Strings
Strictly speaking, C has no string type, so we either use a character array to hold a string or use a string literal directly. Since a string is just an array of characters stored as ASCII codes, how do the library functions know where it ends? The C standard specifies that a string ends with '\0'.
There are two ways to define a string:
Method 1:
    char str1[] = "Hello";
Although Hello has 5 characters, the compiler automatically appends '\0' after 'o', so str1 occupies 6 bytes, as shown in the figure below.
Figure 2.1
Method 2:
    char str2[6] = {'H','e','l','l','o','\0'};
With this form you have to write the trailing '\0' yourself. If the array has no room for it (for example char str2[5], or char str2[] with only the five letters), what you define is a plain character array, not a string.
2.2 String functions
The string-related functions in C (declared in <string.h>) include:
Function name | Meaning
strlen | Get the length of a string
strcpy | Copy a string
strcat | Concatenate (append) a string
strcmp | Compare two strings
strncpy | Copy at most a given number of characters
strncat | Append at most a given number of characters
strncmp | Compare at most a given number of characters
strstr | Find one string inside another (substring search)
strtok | Split a string at the specified delimiters
strerror | Return the message for an error code
The following sections introduce these functions one by one and write simulated implementations for some of them.
2.2.1 strlen
strlen: returns the length of a string.
    size_t strlen( const char *string );
The return type, size_t, is an unsigned integer type. In practice this means you should not subtract two strlen results directly: the difference can never be negative, so a comparison such as strlen(a) - strlen(b) > 0 gives the wrong answer when a is shorter than b.
As mentioned earlier, a string ends with '\0'. To find its length we only need to walk from the start of the string, stop at '\0', and return the number of characters that came before it. (The simulated implementations in this article use assert from <assert.h> and size_t from <stddef.h>.)
This gives the following code:
    size_t my_strlen1(const char *str)
    {
        assert(str);
        size_t count = 0;
        while (*str++) {
            count++;
        }
        return count;
    }
or, using pointer subtraction:
    size_t my_strlen2(const char *str)
    {
        assert(str);
        const char *start = str;
        while (*str++)
            ;
        /* Subtract 1: the loop ends when *str is '\0', but the
           post-increment has still moved str one past the '\0'. */
        return (size_t)(str - 1 - start);
    }
or recursively, without a counter variable:
    size_t my_strlen3(const char *str)
    {
        assert(str);
        if (*str == '\0') {
            return 0;
        }
        return my_strlen3(str + 1) + 1;
    }
2.2.2 strcpy
strcpy: string copy function.
    char *strcpy( char *strDestination, const char *strSource );
It copies the source string strSource into the destination strDestination.
Points to note:
The destination must have enough space to hold the source string, including its '\0', and it must be modifiable (not const).
The source string must end with '\0'; otherwise the copy runs past the end of the source.
The '\0' of the source string is copied too, and serves as the end of the destination string.
The function returns char* (the destination) so that calls can be chained.
Simulated implementation:
    char *my_strcpy(char *dest, const char *src)
    {
        assert(dest);
        assert(src);
        char *ret = dest;
        /* Copy one character at a time; stop after '\0' has been copied. */
        while ((*dest++ = *src++) != '\0')
            ;
        return ret;
    }
2.2.3 strcat
strcat: string concatenation function.
    char *strcat( char *strDestination, const char *strSource );
It appends the source string strSource to the end of the destination string strDestination.
Points to note:
The destination must have enough space to hold the appended source string.
Both strings must end with '\0'.
Appending starts at the position of the destination's '\0', which is overwritten. For this reason a string cannot be appended to itself: the terminator that ends the copy would be overwritten during the copy.
Simulated implementation:
    char *my_strcat(char *dest, const char *src)
    {
        assert(dest);
        assert(src);
        char *ret = dest;
        /* Move to the '\0' at the end of dest. */
        while (*dest) {
            dest++;
        }
        /* Copy src there, including its '\0'. */
        while ((*dest++ = *src++) != '\0')
            ;
        return ret;
    }
2.2.4 strcmp
strcmp: string comparison function.
    int strcmp( const char *string1, const char *string2 );
Strings themselves have no magnitude; what is compared is the ASCII code of their characters. If the first character of string1 has a greater code than the first character of string2, the function returns a number greater than 0; if it is smaller, the function returns a number less than 0; if the two are equal, the second characters are compared, and so on. If both strings end without any difference, the function returns 0.
Simulated implementation:
    int my_strcmp(const char *str1, const char *str2)
    {
        assert(str1);
        assert(str2);
        while (*str1 == *str2) {
            if (*str1 == '\0') {
                return 0;   /* reached the end of both strings: equal */
            }
            str1++;
            str2++;
        }
        /* Compare as unsigned char, as the library function does. */
        return (unsigned char)*str1 - (unsigned char)*str2;
    }
2.2.5 strncpy
strncpy: copies at most a given number of characters.
    char *strncpy( char *strDest, const char *strSource, size_t count );
It copies up to count characters of the source string into the destination.
Points to note:
If the source string is shorter than count, the rest of the destination is filled with '\0' after the copy, until count characters in total have been written.
count must not exceed the size of the destination. If the source has count or more characters, no '\0' is appended, so count should be smaller than the destination size to leave room for terminating the string yourself.
Simulated implementation:
    char *my_strncpy(char *dest, const char *src, size_t n)
    {
        assert(dest);
        assert(src);
        char *ret = dest;
        /* Copy until n characters are written or '\0' has been copied. */
        while (n && (*dest++ = *src++)) {
            n--;
        }
        /* If the loop stopped on '\0', that write was not counted in n,
           so pad the remaining n - 1 positions with '\0'. */
        if (n) {
            while (--n) {
                *dest++ = '\0';
            }
        }
        return ret;
    }
2.2.6 strncat
strncat: appends at most a given number of characters.
    char *strncat( char *strDest, const char *strSource, size_t count );
It appends up to count characters of the source string to the destination string.
Points to note:
The destination string must end with '\0'.
Appending starts at the destination's '\0', which is overwritten. After at most count characters have been appended, a single '\0' is written at the end.
The destination must therefore have room for count + 1 more bytes after its current contents.
If the source string has fewer than count characters, copying stops at the end of the source. Unlike strncpy, the result is not padded with extra '\0' characters.
Simulated implementation:
    char *my_strncat(char *dest, const char *src, size_t n)
    {
        assert(dest);
        assert(src);
        size_t i;
        char *ret = dest;
        /* Move to the '\0' at the end of dest. */
        while (*dest) {
            dest++;
        }
        /* Append at most n characters of src. */
        for (i = 0; i < n && src[i]; i++) {
            dest[i] = src[i];
        }
        dest[i] = '\0';   /* always terminate the result */
        return ret;
    }
2.2.7 strncmp
strncmp: compares the first n characters of two strings.
    int strncmp( const char *string1, const char *string2, size_t count );
It compares at most the first count characters of the two strings; the principle is the same as strcmp.
2.2.8 strstr
strstr: determines whether one string is a substring of another.
    char *strstr( const char *string, const char *strCharSet );
It checks whether strCharSet is a substring of string. The return value is a pointer: NULL if strCharSet is not a substring, otherwise the position of the first occurrence of strCharSet in string.
How it works: starting at the first character of string, compare it with the first character of strCharSet. As long as the characters are equal, move on to the next character in both strings; if the end of strCharSet is reached, a match has been found. If a pair of characters differs, start again from the second character of string against the first character of strCharSet, and so on until the end of string.
Simulated implementation:
    char *my_strstr(const char *str1, const char *str2)
    {
        const char *s1;
        const char *s2;
        const char *cp = str1;   /* current starting position in str1 */
        if (*str2 == '\0') {
            return (char *)str1;
        }
        while (*cp) {
            s1 = cp;
            s2 = str2;
            while (*s1 && *s2 && *s1 == *s2) {
                s1++;
                s2++;
            }
            if (*s2 == '\0') {
                return (char *)cp;   /* all of str2 matched */
            }
            cp++;
        }
        return NULL;
    }
2.2.9 strtok
strtok: splits a string at the specified delimiters.
    char *strtok( char *strToken, const char *strDelimit );
It splits the string strToken at the characters listed in strDelimit.
Points to note:
strToken contains zero or more tokens, separated by one or more of the delimiter characters in strDelimit.
strtok finds the next token in strToken, replaces the delimiter that follows it with '\0', and returns a pointer to the token. The string being split is therefore modified.
If the first argument is not NULL, the function finds the first token in strToken, remembers its position in the string, and returns the start address of that token.
If the first argument is NULL, the function continues from the saved position in the same string and finds the next token.
If there are no more tokens in the string (all of them have been returned), the function returns a NULL pointer.
Usage example:
    char str1[] = "123.456.55.88";
    char str2[] = ".";
    char *p = NULL;
    char str3[50];
    strcpy(str3, str1);   /* split a copy, because strtok modifies its input */
    for (p = strtok(str3, str2); p != NULL; p = strtok(NULL, str2)) {
        printf("%s\n", p);
    }
2.2.10 strerror
strerror: returns the error message that corresponds to an error code.
    char *strerror( int errnum );
When writing a program there are always situations we did not anticipate. In the parts of a program that may fail, we can prepare error reports in advance, so that at run time they help us locate the problem quickly and make the program easier to debug. The system predefines a set of errors and error codes; when a library call fails, it stores the error code in the global variable errno (declared in the header errno.h). Passing errno to strerror returns the corresponding message. When no error has occurred, errno has its initial value, 0.
For example, opening a file for reading fails if the file does not exist. In that case we can call strerror to report why:
    FILE *pFile;
    pFile = fopen("1.txt", "r");
    if (pFile == NULL) {
        printf("Error opening file 1.txt:%s\n", strerror(errno));
    }
Because no such file exists in this test, the message reports that the file does not exist, as shown in the figure.
Figure 2.2
There is another function, perror, which combines printing with strerror, so the printf line in the code above can be rewritten as:
    perror("Error opening file 1.txt");
Both report the same error message; perror writes it to the standard error stream.
3 Memory operation functions
The string functions can compare, copy, and concatenate, but they only work on strings and are limited by the terminator '\0'. When we want to copy or compare data of other types, they are of no use. This section introduces the memory operation functions, which resemble the string functions but are not the same.
3.1 memcpy
memcpy: memory copy function.
    void *memcpy( void *dest, const void *src, size_t count );
It copies count bytes of data, starting at the address src, to the address dest. When the copy is finished it returns dest.
Because it copies memory directly, it can copy data of any type, and it is not limited by '\0': it does not stop when it meets that byte but keeps copying until count bytes have been copied.
When dest lies within the range src to src+count (that is, the two regions overlap), the result may be wrong. Implementations differ between platforms: if each byte is written as soon as it is read, part of the overlapping area is overwritten with new data before it has been read, and the result is not what we expect; if all the data is read first and written afterwards, the result is as expected. The C standard leaves the behaviour of memcpy undefined for overlapping regions.
Simulated implementation:
    void *my_memcpy1(void *dest, const void *src, size_t num)
    {
        assert(dest);
        assert(src);
        void *ret = dest;
        size_t i;
        for (i = 0; i < num; i++) {
            *((char *)dest + i) = *((const char *)src + i);
        }
        return ret;
    }
or
    void *my_memcpy2(void *dest, const void *src, size_t num)
    {
        assert(dest);
        assert(src);
        void *ret = dest;
        while (num--) {
            *(char *)dest = *(const char *)src;
            dest = (char *)dest + 1;
            src = (const char *)src + 1;
        }
        return ret;
    }
3.2 memmove
memmove: memory move.
    void *memmove( void *dest, const void *src, size_t count );
As the name suggests, memmove copies the count bytes starting at src to the position dest, just as memcpy does. As noted above, however, when dest lies within the range src to src+count (or src lies within the range dest to dest+count), that is, when the source and destination regions overlap, memcpy cannot guarantee a correct result. memmove exists to solve this problem.
Analysis: when dest lies within the range src to src+count (dest is to the right of src), as shown in the figure:
Figure 3.1
D is the overlapping area. If we copy from front to back, A is first copied to D, which overwrites the original data at D; when D is later copied to G, what is actually copied is A's data. If we copy from back to front instead, D is first copied to G, then C to F, and so on, which gives the result we want.
When src lies within the range dest to dest+count (dest is to the left of src), as shown in the figure, the same analysis says we should copy from front to back: first D to A, then E to B, and so on.
Figure 3.2
With this analysis, the simulated implementation mainly has to decide whether dest is to the left or to the right of src:
    void *my_memmove(void *dest, const void *src, size_t num)
    {
        assert(dest);
        assert(src);
        void *ret = dest;
        if (dest < src) {
            /* Copy from front to back. */
            while (num--) {
                *(char *)dest = *(const char *)src;
                dest = (char *)dest + 1;
                src = (const char *)src + 1;
            }
        } else {
            /* Copy from back to front: after num-- the value of num
               is the index of the byte to copy. */
            while (num--) {
                *((char *)dest + num) = *((const char *)src + num);
            }
        }
        return ret;
    }
3.3 memcmp
memcmp: memory comparison function.
    int memcmp( const void *buf1, const void *buf2, size_t count );
It compares the first count bytes of the memory areas buf1 and buf2.
If buf1 > buf2, it returns a positive number;
if buf1 = buf2, it returns 0;
if buf1 < buf2, it returns a negative number.
Simulated implementation:
    int my_memcmp(const void *buf1, const void *buf2, size_t num)
    {
        assert(buf1);
        assert(buf2);
        const unsigned char *p1 = buf1;
        const unsigned char *p2 = buf2;
        /* Check the count before reading, so no byte past num is touched. */
        while (num--) {
            if (*p1 != *p2) {
                return *p1 - *p2;
            }
            p1++;
            p2++;
        }
        return 0;
    }
3.4 memset
memset: memory fill, typically used for initialization.
    void *memset( void *dest, int c, size_t count );
Starting at dest, it sets each of the next count bytes to the value c and returns dest.
Because the assignment is done byte by byte, only the low eight bits of c are used, however large c is. For this reason, when the elements are larger than one byte, memset is normally used only with 0 (all bits 0) or -1 (all bits 1); any other value gives element values that are easy to get wrong.
Copyright notice
This article was written by [South of the city Hua has opened]. Please include a link to the original when reposting. Thank you.