String Underlying implementation


string stay C++ It is also an important knowledge , But I want to use it well , You need to know how its bottom layer is written , In order to better use this string, So let's do it this time string The bottom of the , however string There are many interface functions , We cannot achieve it all at once , The main thing to realize this time is its common interface , And important interfaces The functions realized this time are :string Constructor for , copy constructor , Destructor , iterator ( It is divided into const And non const Two versions ),reserve,push_back,append,+=( It is divided into += Characters and strings ),clear,swap, Stream insertion , Stream output ,c_str,size,capacity,empty,resize,[] overloaded ( Not for const And non const Two versions ),<,<=,>,>=,==,!= overloaded ,find( It is divided into two versions: search character and string )insert( There are two versions of insert character and insert string ),erase.( As for how many have been achieved , Not here , Quite a lot of ...


First, get ready :  Because it needs to be implemented separately string The bottom of the , Therefore, in order to avoid the connection with string Conflict , So we encapsulate it in a separate namespace , Secondly, its four private members :_str( Store string ),_capacity( Record the capacity ),_size( Record valid characters ),npos( It is needed in some functions )


One : Constructors

1 Its effective number and size is its length , use strlen Calculation is enough .

2 Open a space ( here +1 To give \0 Reserved location   When opening space, you must give \0 One more )

3 Copy the string to the open space

1         string(const char* str = "")
2 :_size(strlen(str))
3 , _capacity(_size)
4 {
5 _str = new char[_capacity + 1];//+1 It's for '\0' Leave a place string Open space must be more than one position
6 strcpy(_str, str);
7 }

Two : Copy structure

Copy structures are divided into : Traditional writing and modern writing

The traditional way of writing : The structure is the structure , The copy is the copy

1 Its effective number and size is its length , use strlen Calculation is enough .

2 Open a space ( to \0 Open one more space )

3 Copy the string to this space

4 Assign this space to _str

5 It's size And capacity And s Sync

 1         string(const string& s)
2 :_size(strlen(s._str))
3 ,_capacity(_size)
4 {
5 char* tmp = new char[_capacity + 1];
6 strcpy(tmp, s._str);
7 _str = tmp;
8 _size = s._size;
9 _capacity = s._capacity;
10 }

Modern writing : utilize tmp Copy , Let them Exchange

1 Because the copy structure is copied to a nonexistent , So initialize them first , So that they can exchange

2 utilize tmp Construct a string that needs to be copied

3 let _str And tmp In exchange for

        string(const string& s)
:_str(nullptr)
,_size(0)
,_capacity(0)
{
string tmp(s._str);
swap(tmp);
}

( Modern writing is more recommended here )

3、 ... and : Destructor

1 When str Not empty

2 Release , And leave it empty , Then set its valid characters and size 0

1         ~string()
2 {
3 if (_str)
4 {
5 delete[] _str;
6 _str = nullptr;
7 _size = _capacity = 0;
8 }
9 }

Four : Assignment overload

It is divided into traditional writing and modern writing ① and ②

The traditional way of writing

1 When you don't assign values to yourself

2 Open a space

3 Copy the string into the space

4 release _str Space

5 hold tmp Space assigned to _str

6 And then _str The valid characters and size of the assigned string are the same

7 return

 1         string& operator=(const string &s)
2 {
3 if (this != &s)
4 {
5 char* tmp = new char[s._capacity + 1];
6 strcpy(tmp, s._str);
7 delete[] _str;
8 _str = tmp;
9 _size = s._size;
10 _capacity = s._capacity;
11 }
12 return *this;
13 }

Modern writing ①

1 utilize tmp Construct a string

2 Give Way _str And tmp In exchange for

3 return

1         string& operator=(const string &s)
2 {
3 string tmp(s._str);
4 swap(tmp);
5 return *this;
6 }

Modern writing ②

1 There's something special here , Because it needs direct exchange without change s The data of , There is no need to quote , Instead, it returns by passing a value

2 hold _str Exchange with string

3 return

1         string& operator=(string s)
2 {
3 swap(s);
4 return *this;
5 }

5、 ... and :swap

Because it is necessary to complete the exchange operation of three private members , therefore swap Exchange three private members directly in

Here we need to use the Library , But the compiler will default to use the swap, So we need to add std, Use the Library swap

1         void swap(string& s)
2 {
3 std::swap(_str, s._str);
4 std::swap(_size, s._size);
5 std::swap(_capacity, s._capacity);
6 }

6、 ... and :c_str

Because stream insertion has not been implemented , So the demonstration can print with this function

Because this function just prints , Without changing the string , So add const, Prevent unexpected changes

1         const char* c_str()const
2 {
3 return _str;
4 }

7、 ... and :reserve

This function is specially used for capacity expansion .

1 When the required value is greater than space , It needs to be expanded

2 Create a space ( by \0 Open one more space )

3 hold _str Copy to new space

4 And then _str Space release

5 Assign the new space to _str

6 Put the new size capacity Change to the expanded size

 1         void reserve(size_t n)
2 {
3 if (n > _capacity)
4 {
5 char* tmp = new char[n + 1];
6 strcpy(tmp, _str);
7 delete[] _str;
8 _str = tmp;
9 _capacity = n;
10 }
11 }

8、 ... and :push_back

This function is used to insert a character at the end

1 First judge whether the space is full

2 utilize reserve Open space ( There are special circumstances here : Sometimes the space we open is 0  Then continue with 2 Double expansion , It cannot be expanded (0*n=0)  So when capacity by 0 when   Capacity expansion 4  If not for 0  Double expansion

3 After capacity expansion or when capacity expansion is not required   Add the characters to be added in the place of valid characters

4 Then move the valid characters back

5 Add again \0  Otherwise, there is no \0  Can't stop

 1         void push_back(char c)
2 {
3 if (_size == _capacity)
4 {
5 reserve(_capacity == 0 ? 4 : _capacity * 2);
6 }
7 _str[_size] = c;
8 ++_size;
9 _str[_size] = '\0';
10 }

Nine :+= heavy load

In practical use ,+= With the function of push_back identical , also += It is more convenient , So we need to provide += The function of

1 Because the actual function is the same , It can be reused directly push_back that will do

2 Because you need to support continuous += So we need to go back

1         string& operator+=(char c)
2 {
3 push_back(c);
4 return *this;
5 }

Ten :append

This function is used to add a string

1 Calculate the valid characters and the length of the string to be added

2 If the length of the valid character and the string to be added is greater than the actual space

3 So we need to expand , Expand string length + Valid character length is enough

4 Copy the string to the end of the valid character

6 The valid characters are also changed to valid characters and the length of the string to be added

 1         void append(const char* str)
2 {
3 int len = _size+strlen(str);
4 if (len > _capacity)
5 {
6 reserve(len);
7 }
8 strcpy(_str+_size, str);
9 _size = len;
10 }

11、 ... and :+= overloaded

+= The addition of strings is also better than append Use more times , So we also need to provide

The actual functions here are the same , So reuse directly

1 Because you want to support continuous +=, So we need to go back

1         string& operator+=(const char* str)
2 {
3 append(str);
4 return *this;
5 }

Twelve :clear

Used to clear data

Clear data , But there is no room to shrink , Just clear the data in the space , This is something to be aware of

1 Add... In the first place \0  When printing \0  Will stop , The following data cannot be printed

2 The effective number is modified to 0

1         void clear()
2 {
3 _str[0] = '\0';
4 _size = 0;
5 }

13、 ... and : Provide query size

This function provides private members size Size

Because there is no need to modify , So add const

1         size_t size()const
2 {
3 return _size;
4 }

fourteen : Provide query capacity

This function provides private members capacity Size

Because there is no need to modify , So add const

1         size_t capacity()const
2 {
3 return _capacity;
4 }

15、 ... and :empty

It provides the function of judging empty

1 When _str When the string is empty   return true

2 When _str When it is not an empty string   return false

 1         bool empty()const
2 {
3 if (_str == "")
4 {
5 return true;
6 }
7 else
8 {
9 return false;
10 }
11 }

sixteen :resize

function : Capacity expansion , And initialize

When the size to be expanded , Smaller than the actual size , Then it is necessary to put the actual size - To expand the unused space, clear the data

1 The size of Dangdang to be expanded , Smaller than the actual size

2size Change to the size to be expanded

3 Add \0

4 When the size to be expanded is larger than the actual size

5 Capacity expansion n Size

6 Start with the actual valid characters , Until the actual space size ends

7 All revised to c ( If you don't pass parameters , The default is \0)

8 The actual valid characters are changed to n

9 Change the valid characters to \0

 1         void resize(size_t n, char c = '\0')
2 {
3 // When n Less than size when
4 if (n < _size)
5 {
6 _size = n;
7 _str[_size] = '\0';
8 }
9 else// When n Greater than size when
10 {
11 if (n > _capacity)
12 {
13 reserve(n);
14 }
15
16 for (size_t i = _size; i < n; i++)
17 {
18 _str[i] = c;
19 }
20 _size = n;
21 _str[_size] = '\0';// Add characters Manually add \0
22 }
23 }

seventeen :[] heavy load

This function is divided into non const Version and const edition

Not const edition

1 The size to be accessed cannot exceed valid characters   Need to assert

2 return _str The value of the corresponding subscript

1         char& operator[](size_t index)
2 {
3 assert(index < _size);
4
5 return _str[index];
6 }

const edition

In some places, the access subscript does not need to be modified , So we need to provide const edition

1         const char& operator[](size_t index) const
2 {
3 assert(index < _size);
4
5 return _str[index];
6 }

eighteen :<,<=,>,>=,==,!= overloaded

Use here strcmp() function   if _str<s._str  return <0 Value   Equal return 0  Greater than return >0 Value

And other functions can be reused directly

 1         bool operator<(const string& s)
2 {
3 return strcmp(_str, s._str) < 0;
4 }
5
6 bool operator<=(const string& s)
7 {
8 return _str < s._str || strcmp(_str, s._str) == 0;
9 }
10
11 bool operator>(const string& s)
12 {
13 return strcmp(_str, s._str) > 0;
14 }
15
16 bool operator>=(const string& s)
17 {
18 return _str>s._str|| strcmp(_str, s._str) == 0;
19 }
20
21 bool operator==(const string& s)
22 {
23 return strcmp(_str, s._str) == 0;
24 }
25
26 bool operator!=(const string& s)
27 {
28 return !(_str == s._str);
29 }

nineteen :find

function : return c stay string The first place in

1 from pos Start looking at the location , If characters are found c  return , If the cycle ends , It means no , return npos( representative -1)

 1         size_t find(char c, size_t pos = 0) const
2 {
3 assert(pos <= _size);
4 for (; pos <= _size; pos++)
5 {
6 if (_str[pos] == c)
7 {
8 return pos;
9 }
10 }
11 return npos;
12 }

twenty :find

Function returns substring s stay string The first place in

Use here strstr function , Find string s

If not found   Returns an empty   You need to determine

If it is empty , return npos

If not be empty   Return to where it first appeared

First occurrence -_str(*)

 1         size_t find(const char* s, size_t pos = 0)
2 {
3 const char* p = strstr(_str + pos, s);
4 if (p == nullptr)
5 {
6 return npos;
7 }
8 else
9 {
10 return p - _str;
11 }
12 }

The 21st :insert

function : stay pos Insert character in position c, And return the position of the character

1 First judge whether the space is full

2 When it is full, it needs to be expanded ,  A special case   When capacity by 0 when   Capacity expansion 4  When not for 0 when ,2 Double expansion

3 Define index end Start after valid characters (\0 Later on   Valid characters are recorded \0)

4 Move a number back , end Forward , Continue to move the next number

5 When you reach the position where you need to insert , stop it

6 At the position to be inserted   Insert characters c

7 Valid characters +1

8 return

        string& insert(size_t pos, char c)
{
assert(pos <= _size);
if (_size == _capacity)
{
reserve(_capacity == 0 ? 4 : _capacity * 2);
} size_t end = _size + 1;
while (end > pos)
{
_str[end] = _str[end-1];
--end;
}
_str[pos] = c;
++_size; return *this;
}

Twenty-two insert

function : stay pos Insert string at position str, And return the position of the character

1 When the position to be inserted is greater than the valid character   It needs to be asserted that

2 Calculate the length of the added string

3 If the length of valid characters plus the added string is greater than the actual size

4 The length of the expanded valid characters plus the added string is greater than the actual size

5 Define index end From valid characters +len ( This is to have enough space to insert strings )

6 When end here we are pos+len-1 Stop where you are

7 Move each character len A place

8end--  Continue to move to the next position

9 use strncpy function , Copy the string in

10 Valid characters plus the length of the string

11 return

 1         string& insert(size_t pos, const char* str)
2 {
3 assert(pos <= _size);
4 size_t len = strlen(str);
5 if (len+_size > _capacity)
6 {
7 reserve(len+_size);
8 }
9
10 size_t end = _size + len;
11 while (end>pos+len-1)//-1 It's because of one \0
12 {
13 _str[end] = _str[end-len];
14 end--;
15 }
16 strncpy(_str + pos, str, len);
17 _size += len;
18
19 return *this;
20 }

23 :erase

1 When the position to be deleted is greater than the valid character   To assert

2 When the position is not given or the length to be deleted is greater than the valid characters

3 Directly in pos Position plus \0  Delete directly pos Later data

4 The valid characters are modified to pos

5 If it is not greater than the valid character , When you want to delete some strings

6 Give Way begin from pos+len( Start at the end of the string to be deleted )

7 Cover the front   from begin-len Start overlay at )

8begin++  Continue to begin The number of covers the front

9 until begin There are valid characters

10 Valid characters minus the length of the string to be deleted

11 Add \0  Always delete , Add characters these cannot be added automatically \0 Of   Must be added manually \0

12 return

 1         string& erase(size_t pos, size_t len=npos)
2 {
3 assert(pos < _size);
4 if (pos == npos || pos + len > _size)
5 {
6 _str[pos] = '\0';
7 _size = pos;
8 }
9 else
10 {
11 int begin = pos + len;
12 while (begin < _size)
13 {
14 _str[begin - len] = _str[begin];
15 ++begin;
16 }
17 }
18 _size -= len;
19 _str[_size] = '\0';
20
21 return *this;
22 }

Twenty-four : iterator

Divided into non const Version and const edition

Not const edition

take char*  Rename it to  iterator

The iterator is  begin end

begin Returns the first

Go straight back to _str that will do

end Return tail

Add valid characters to the return header

 1                 typedef char* iterator;
2
3 iterator begin()
4 {
5 return _str;
6 }
7
8 iterator end()
9 {
10 return _str + _size;
11 }

const edition

speak char*  Rename it to  const_iterator

I just have to add const that will do

 1 typedef const char* const_iterator;
2 const_iterator begin() const// To provide const And non const Two versions
3 {
4 return _str;
5 }
6
7 const_iterator end() const
8 {
9 return _str + _size;
10 }

twenty-five : Stream insertion 、 Stream extraction

Stream extraction

Because you need to match , So it needs to be implemented outside the class

1 Because continuous extraction should be supported   So want to use ostream& As return type

2 Use return for  Print with this

3 return

1     ostream& operator<<(ostream& out, const string& s)
2 {
3 for (auto k : s)
4 {
5 out << k;
6 }
7 return out;
8 }

Stream insertion

Because you need to match , So it needs to be implemented outside the class

1 Create a character

2 Characters are used to extract   Because characters cannot be separated by spaces   To prevent conflict   So we need get  In exchange for behavioral segmentation

3 establish buff Array   Initialize to \0

4 Define index  i

5 When not with spaces   When changing behavior conditions

6 Put the characters into the array respectively

7 When the array is full

8 Say that the array is added as a string s

9 Let's talk about it again buff All overloads are \0

10 Index overload is 0

11 After the end of the cycle , Continue inserting into ch

12 When there are less than full characters left , Add to s Inside

13 return

    istream& operator>>(istream& in, string& s)
{
char ch;
ch = in.get();
char buff[128] = { '\0' };
size_t i = 0;
while (ch != ' ' && ch != '\n')
{
buff[i++] = ch;
if (i == 127)
{
s += buff;
memset(buff, '\0', 128);
i = 0;
}
ch = in.get();
}
s += buff;
return in;
}

Here we add two easy-to-use functions

to_string  Let's talk about the conversion from integer to string

1     int i = 123456;
2 string s1 = to_string(i);

stoi  Let's talk about string conversion to integer

1     int n = stoi(s1);

This is the whole content of this article , If it helps you , Hope to get your praise !

string More articles on the underlying implementation of

  1. Assembly snooping Swift String The bottom of the

    String( character string ), Is a very important member of all programming languages , So it's worth studying deeply . as everyone knows , The essence of string is character sequence , It consists of several characters . Like strings "iOS" from 'i'.'O'.'S ...

  2. C++ And string The bottom layer of is really used char Array ?

    One . introduction There's a problem : Use encryption library to encrypt data , Get the ciphertext , Use string Save and transfer , Then it can be decrypted correctly , But use string.c_str() If parameters are passed, plaintext cannot be decrypted correctly . as a result of : In ciphertext ...

  3. 【1】String,StringBuffer,StringBuillder Research on the underlying structure of

    One :StringBuffer The bottom of the (1) Thread safe string operation class (2) adopt synchronized Keyword declaration synchronization method , Ensure data security in multithreaded environment @Override public synchroniz ...

  4. String,StringBuffer,StringBuillder Bottom structure of

    One :StringBuffer The bottom of the (1) Thread safe string operation class (2) adopt synchronized Keyword declaration synchronization method , Ensure data security in multithreaded environment public synchronized StringB ...

  5. Redis Of the underlying data structure String

    Article reference :<Redis Design and implementation > Huang Jianhong Redis Of string The underlying type uses SDS( Dynamic string ) Realized , The specific data structure is as follows : struct sdshdr { int len ...

  6. String、StringBuffer、StringBuilder Source code analysis

    Use decompilation to see "+" The process of 1 public class Test 2 { 3 public static void main(String[] args) 4 { 5 int ...

  7. Java Basic knowledge summary ( Four 、String)

    Four .String public final class String extends Object implements Serializable, Comparable<String>, ...

  8. JDK6 And JDK7 in String class subString() Differences in methods

    1.subString() The role of methods subString(int beginIndex, int endIndex) Method returns with beginIndex Start to endIndex-1 End of a call string ...

  9. golang string and []byte Comparison of

    golang string and []byte Comparison of Why? string and []byte Type conversion comes at a price ? Why do built-in functions copy There will be a special situation copy(dst []byte, src string) in ...

  10. java Basics ( 6、 ... and )-----String In depth analysis of the nature of

    This article will explain String Some properties of . One .String The immutability of For beginners , It's easy to mistake String Objects can be changed , especially + When linking , The object seems to have really changed . However ,String Once an object is created, it cannot be modified ...

Random recommendation

  1. I was a tutor a few years ago C course ( The fourth is about the pointer and the tower of Hanoi )

    C A treasure of language learning (4) The pointer : It can effectively represent complex data structures , Can dynamically allocate dynamic space , Easy to use string , Use arrays effectively , Can handle memory unit directly If you don't master the pointer, you don't master it C The essence of language Address : The system allocates an inner for each variable ...

  2. WP8.1 Slide on the side Item

    design sketch I think ios There are many similar ones on Android Item The effect of ,UWP There is Microsoft's official library on , There is a similar effect , it seems WP8.1 No! , By the way, my program also needs , I just copied one . The specific idea is : Touch control GRId stay CA ...

  3. GPS Latitude and longitude are converted into XY coordinate

    ;        }

  4. 【BZOJ 2321】 [BeiJing2011 training ] Starcraft

    Description Magic Land It's been centuries since I was born …… Now? , People talk about a legend : Once upon a time , Their ancestors came to an island in the East , It's just another world . Good at analysis and construction Magic Lan ...

  5. css3 Wait box

    The first 1 Kind of effect : <div class="loading"> <span></span> <span></span> < ...

  6. PHP Relational query

    article Article table : aid title content uid 1 article 1 article 1 Text content ... 1 2 article 2 article 2 Text content ... 1 3 article 3 article 3 Text content ... 2 4 article 4 article 4 ...

  7. BZOJ4076 : [Wf2014]Maze Reduction

    set up $f[i][j][k]$ From the room $j$ Of the $k$ No more than $i$ What's going on . about $0$ What's going on , It can be expressed in degrees of each room . Otherwise, you can walk around the room , Put all the situations in order hash To express . most ...

  8. luogu4932 browser ( Demolition )

    analysis 1 The parity of the number of : p. xor p. = accidentally xor accidentally = accidentally p. xor accidentally = p. So as long as statistics 1 The number of is the number of odd numbers and Is an even number Just ride together Direct use bitset To do it , Although the constant is very small / Random data can pass , But complexity ...

  9. How to Pronounce the 50 States

    How to Pronounce the 50 States (1/4) Share Tweet Share Tagged With: Places The US state names can be ...

  10. React Native Fill a hole

    React Native Fill a hole About RN Layout It is divided into main axis and cross axis , The main axis can be horizontal or vertical , The cross axis also corresponds to . The spindle is vertical by default . If you want to change it, use flexdirection Spindle alignment :justif ...