当前位置:网站首页>Simulate the implementation of string class

Simulate the implementation of string class

2022-07-07 19:54:00 Yuucho

1. Implement a simple one string

1.1 ordinary string class

We step by step , Don't consider string Add, delete, check and modify , Only consider string The deep and shallow copy problem of .

//string.h
#pragma once
#include <string.h>

namespace Yuucho
{
    
    class string
    {
    
        public:
        // iterator ( Embedded type ), Name it begin、end( Grammatical rules )
        // As long as there is an iterator, you can use the scope for
        // Because the bottom layer is the scope for Replace with iterator 
        typedef char* iterator;
        typedef const char* const_iterator;
        
        // Take the first valid data as begin
        iterator begin()
        {
    
            return _str;
        }
        
        // Take the last one of the valid data as end
        iterator end()
        {
    
            return _str + _size;
        }
        
        // Provide const edition , The return value cannot be modified 
        //const Object cannot call non const The iterator 
        const_iterator begin() const
        {
    
            return _str;
        }
        
        const_iterator end() const
        {
    
            return _str + _size;
        }
        
        // The default value , Empty string only '\0'
        string(const char* str = "")
            : _size(strlen(str))
            , _capacity(_size)
        {
    
            // For ever '\0' One more 
            _str = new char[_capacity + 1];
            strcpy(_str,str);
        }
        
        ~string()
        {
    
            if(_str)
            {
    
                delete[] _str;
                _str = nullptr;
                _size = _capacity = 0;
            }
        }
        
        // No overloaded stream extraction 、 Stream insertion , use c_str To print strings 
        // Common object 、const Objects can be called 
        const char* c_str() const
        {
    
            return _str;
        }
        
        // Reference return is convenient for modification , Reduce copies 
        char& operator[](size_t pos)
        {
    
            assert(pos < _size);
            return _str[pos];
        }
        
        // by const Object preparation , return const References to , Do not modify 
        const char& operator[](size_t pos) const
        {
    
            assert(pos < _size);
            return _str[pos];
        }
        
        size_t size() const
        {
    
            return _size;
        }
        
        size_t capacity() const
        {
    
            return _capacity;
        }
		
        void clear()
        {
    
            _str[0] = '\0';
            _size = 0;
        }
        private:
        char* _str;
        size_t _size;	  // Number of valid characters 
        size_t _capacity;  // The actual space to store valid characters 
        const static size_t npos;
        //const static size_t npos = -1;( Special treatment of compiler )
    };
    const size_t string::npos = -1;
}

test :

//test.h
#include <iostream>
using namespace std;

#include "string.h"

void test_string1()
{
    
    Yuucho::string s1("hello world");
    //s1.operator[](0) = 'x';
    s1[0] = 'x';
    cout << s1.c_str() << endl;
    
    for(size_t i = 0; i < s1.size(); ++i)
    {
    
        cout << s1[i] << " ";
    }
    cout << endl;
}

int main()
{
    
    test_string1();
    return 0;
}

 Insert picture description here

'x’ Assigned to the return value of this function call expression :

 Insert picture description here

1.2 Shallow copy problem

What we are currently writing string Class also faces a major problem . If you don't actively write copy constructors and copy assignment functions , The compiler will use “ Copy by member ( Shallow copy )” Automatically generate the corresponding default function . If the class contains pointer members or reference members , Then these two default functions may contain errors .

void test_string2()
{
    
    Yuucho::string s1("hello world");
    Yuucho::string s2(s1);
    cout << s1.c_str() << endl;
    cout << s2.c_str() << endl; 
}

Take two objects s1、s2 For example . hypothesis s1._str The content is "hello world", Will now s1 Copy structure to s2, Default copy constructor " Copy by member " Will cause 2 A mistake :

(1)s1、s2 The two pointer members of point to the same block of memory ( Space on the heap ),s1 and s2 Any change in one side will affect the other .

(2) When an object is destructed ,_str It has been deconstructed twice . The reason why it cannot be destructed twice is :

After a piece of memory is returned to the operating system , The system may allocate this memory to other places , If you destruct at this time, you will free the memory in other places , This is not allowed . The same goes for copy assignment . This process is shown in the figure below :

 Insert picture description here

 Insert picture description here

notes : Copy constructor and copy assignment function are very confusing , It often leads to wrong writing 、 Misuse . The copy constructor is called when an object is created and initialized with another existing object , The assignment function can only assign an object to another existing object , Make the existing object have the same state as the source object .

string a("hello");
string b("world");
string c = a;	// Call the copy constructor , It's best to write it. c(a);
c = b;			// Call the assignment function 

In this case, the 3 The style of this sentence is poor , It should be rewritten as string c(a), To distinguish it from the 4 statement .

2.string Deep copy

2.1 copy constructor

string A simple implementation of the copy constructor of class :

// Copy structures must be passed by reference , Otherwise, infinite recursion will be caused 
//s2(s1)
string(const string& s)
    :_size(strlen(s._str))
    , _capacity(_size)
{
    
    _str = new char[_capacity + 1];
    strcpy(_str,s._str);
}

 Insert picture description here

test test_string2:

 Insert picture description here

First of all, deconstruct s2, Re structure s1:

 Insert picture description here
 Insert picture description here

notes :string The difference between the class copy constructor and the default constructor is : At the function entrance, there is no need to nullptr Compare , This is because “ quote ” It can't be nullptr, and “ The pointer ” It can be for nullptr.

2.2 Copy assignment function

string A simple implementation of the assignment function of class :

string& operator=(const string& s)
{
    
    //(1) Check self assignment 
    if(this != &s)// here & Is the address 
    {
    
        //(2) Allocate new memory resources , And copy the content 
        char *temp = new char[s._capacity + 1];
        strcpy(temp,s._str);//'\0' Also copied 
        //(3) Release the original memory resources 
        delete[] _str;
        _str = temp;
        _size = s._size;
        _capacity = s._capacity;
    }
    //(4) Returns a reference to this object 
    return *this;
}

notes :

If the first (2) First, release the original memory resources , Then if the subsequent memory reallocation operation fails , It's miserable. ! contrary , If you first allocate memory to a temporary pointer to save , In case of allocation failure ( Throw an exception ) It doesn't change this object , This is to achieve exceptional security !

If the first (3) Do not release memory , There will be no chance in the future , Because it will cause memory leakage .

(4) Returns a reference to this object , The purpose is to realize such as a = b = c; Such a chained expression .

test :

void test_string3()
{
    
    Yuucho::string s1("hello world");
    Yuucho::string s2(s1);
    Yuucho::string s3("111111111");
    s1 = s3;
    cout << s1.c_str() << endl;
    cout << s2.c_str() << endl;
    cout << s3.c_str() << endl;   
}

 Insert picture description here

3. Addition, deletion, modification and overloading of relational operators

3.1 reserve

reserve It's for expansion , It will request that the string capacity be changed according to the planned size to a maximum of n The length of characters .reserve You can also open the space in advance , prevent push_back Repeated capacity expansion .reserve It won't shrink , Because you gave n Than _capacity Small , It can do nothing .

void reserve(size_t n)
{
    
    if(n > _capacity)
    {
    
        char* tmp = new char[n + 1];
        strcpy(tmp, _str);
        delete[] _str;
        _str = tmp;
        _capacity = n;
    }
}

3.2 resize

resize Yes, it will not only change the space , It will also change _size And the contents of the string . It is often used to expand space + Initialize or delete some data , If n< Less than size, Before retention n Data .
 Insert picture description here

void resize(size_t n, char ch = '\0')
{
    
    if (n < _size)
	{
    
		_size = n;
		_str[_size] = '\0';
	}
	else
	{
    
		if (n > _capacity)
		{
    
			reserve(n);
		}

		for (size_t i = _size; i < n; ++i)
		{
    
			_str[i] = ch;
		}
		_size = n;
		_str[_size] = '\0';
	}	
}

3.3 push_back and operator+=

push_back Is to insert a character at the end , So let's be a little violent , If it's full, we'll expand it directly 2 times . Then copy the data . Release the old space again . Note the empty string (_capacity to 4).

void push_back(char ch)
{
    
    if(_size == _capacity)
    {
    
        reserve(_capacity == 0 ? 4 : _capacity * 2);
    }
    _str[_size] = ch;
    ++_size;
    _str[_size] = '\0';
}

// Realized after reuse insert
void push_back(char ch)
{
    
    insert(_size,ch);
}

string& operator+=(char ch)
{
    
    push_back(ch);
    return *this;
}

3.4 append and operator+=

append It is used to insert strings , Capacity expansion 2 Times may not be enough .

void append(const char* str)
{
    
    size_t len = _size + strlen(str);
    if(len > _capacity)
    {
    
        reserve(len);
    }
    strcpy(_str + _size, str);
    _size = len;
}

string& operator+=(const char*str)
{
    
    append(str);
    return *this;
}

Reuse insert:

void append(const char* str)
{
    
   insert(_size,str);
}

3.5 insert( Insert characters )

Insert a character anywhere , It is the same as the implementation of our learning data structure . The library will also return *this, We also achieve this . Insertion is also a modification of the original data .

It's the same as learning the sequence table , The following way of writing will cause problems when inserting the head . Because when the head is inserted end It will arrive -1,size_t end = -1; It's a big number .

 Insert picture description here

 Insert picture description here

So we take ’\0’ The latter one acts as end, Make sure end At the end of the loop , Stop at the subscript 0 The location of .

 Insert picture description here

string& insert(size_t pos, char ch)
		{
    
			assert(pos <= _size);
			if (_size == _capacity)
			{
    
				reserve(_capacity == 0 ? 4 : _capacity * 2);
			}
    
			size_t end = _size+1;
			while (end > pos)
			{
    
				_str[end] = _str[end-1];
				--end;
			}

			_str[pos] = ch;
			_size++;

			return *this;
		}

Tips : You have seen insert After the underlying implementation of , Can well understand why insert The efficiency is not high . So frequent head insertion is better than tail insertion and then reverse .

3.6 insert( Insert string )

 Insert picture description here

string& insert(size_t pos, const char* str)
		{
    
			assert(pos <= _size);
			size_t len = strlen(str);
			if (_size + len > _capacity)
			{
    
				reserve(_size + len);
			}

			//  Move back len A place 
			size_t end = _size + len;
			while (end > pos+len-1)
			{
    
				_str[end] = _str[end -len];
				--end;
			}

			strncpy(_str + pos, str, len);
			_size += len;

			return *this;
		}

Be careful : If while The judgment condition of the statement is written as end >= pos+len, In extreme conditions ( Empty string and head insert ) There will be problems .

as a result of end Go to the -1 It becomes a big number again , Program loop . Of course, it's easy to deal with it , use if Just judge the sentence .

if(len == 0)
{
    
    return *this;
}

3.7 erase

 Insert picture description here

string& earse(size_t pos, size_t len = npos)
{
    
	assert(pos < _size);

	if (len == npos || pos + len >= _size)
	{
    
		_str[pos] = '\0';
		_size = pos;
	}
	else
	{
    
		size_t begin = pos + len;
		while (begin <= _size)
		{
    
			_str[begin - len] = _str[begin];
			++begin;
		}

		_size -= len;
	}

	return *this;
}

3.8 find

size_t find(char ch, size_t pos = 0)
{
    
	for (; pos < _size; ++pos)
	{
    
		if (_str[pos] == ch)
		{
    
			return pos;
		}
	}

	return npos;
}

size_t find(const char* str, size_t pos = 0)
{
    
	const char* p = strstr(_str + pos, str);
	if (p == nullptr)
	{
    
		return npos;
	}
	else
	{
    
		return p - _str;
	}
}

3.9 Operator overloading

Such functions cannot be written as member functions , The reason is that member functions implicitly contain this The pointer .

// Stream extraction 、 Stream inserts do not need to be written as friends without accessing private 
ostream& operator<<(ostream& out, const string& s)
{
    
	for (auto ch : s)
	{
    
		out << ch;
	}
    // Support continuous stream insertion 
	return out;
}

// utilize buff Reduce continuous expansion , To optimize 
istream& operator>>(istream& in, string& s)
{
    
    s.clear();
	char ch;
	ch = in.get();
	char buff[128] = {
    '\0'};
	size_t i = 0;
	while (ch != ' ' && ch != '\n')
	{
    
		buff[i++] = ch;
		if (i == 127)
		{
    
			s += buff;
			memset(buff, '\0', 128);
			i = 0;
		}

		ch = in.get();
	}

	s += buff;
    // Support continuous stream extraction 
	return in;
}

bool operator<(const string& s1, const string& s2)
{
    
	return strcmp(s1.c_str(), s2.c_str()) < 0;
}

bool operator==(const string& s1, const string& s2)
{
    
	return strcmp(s1.c_str(), s2.c_str()) == 0;
}

bool operator<=(const string& s1, const string& s2)
{
    
	return s1 < s2 || s1 == s2; } bool operator>(const string& s1, const string& s2)
{
    
	return !(s1 <= s2);
}

bool operator>=(const string& s1, const string& s2)
{
    
	return !(s1 < s2);
}

bool operator!=(const string& s1, const string& s2)
{
    
	return !(s1 == s2);
}

4. string Modern writing of classes

In addition to the most basic implementation methods given above, copy constructor and copy assignment function ,string The copy constructor and copy assignment function of the class can also be implemented in this way .

 Insert picture description here

void swap(string& s)
{
    
	std::swap(_str, s._str);
	std::swap(_size, s._size);
	std::swap(_capacity, s._capacity);
}

//_str No initialization is random value 
// In exchange for tmp after ,tmp If you want to destruct out of scope, you will make mistakes 
// So we have to deal with s2 To initialize 
//s2(s1)
string(const string& s)
	:_str(nullptr)
	, _size(0)
	, _capacity(0)
{
    
	string tmp(s._str);
	swap(tmp);
}

string& operator=(const string& s)
{
    
	if (this != &s)
	{
    
		string tmp(s._str);  // Call the copy constructor 
		swap(tmp);			 // The current object is exchanged with the temporary object , Abnormally safe !
	}

	return *this;
}

string& operator=(string s)  // Value passing will call the copy constructor 
{
    
	swap(s);				 // Exchange directly with temporary objects , Abnormally safe !
	return *this;
}

Copy assignment function is realized by calling copy constructor , You can ensure that the assignment operation is terminated immediately when the copy constructor throws an exception , Therefore, the lvalue object will not be modified .

原网站

版权声明
本文为[Yuucho]所创,转载请带上原文链接,感谢
https://yzsam.com/2022/188/202207071736484361.html

随机推荐