[rust notes] 16 input and output (Part 1)

2022-07-05 06:05:00 phial03

16 - Input and output

16.1 - Readers and writers

  • Rust's standard library features for input and output are organized around three traits, Read, BufRead, and Write, and the various types that implement them.

    • Values that implement Read are readers: they have methods for byte-oriented input.
    • Values that implement BufRead are buffered readers: they support all the methods of Read, plus methods for reading lines of text.
    • Values that implement Write are writers: they support both byte-oriented and UTF-8 text output.
  • Common readers (sources of bytes):

    • std::fs::File::open(filename): opens a file for reading.
    • std::net::TcpStream: receives data from the network.
    • std::io::stdin(): reads from the process's standard input stream.
    • std::io::Cursor<&[u8]>: "reads" from a byte array that is already in memory.
  • Common writers (destinations for bytes):

    • std::fs::File::create(filename): opens a file for writing.
    • std::net::TcpStream: sends data over the network.
    • std::io::stdout() and std::io::stderr(): write to the terminal.
    • std::io::Cursor<&mut [u8]>: lets any mutable byte slice be written to as if it were a file.
    • Vec<u8>: also a writer; its write methods append to the vector.
  • Generic code written against the std::io::Read and std::io::Write traits works across all of these input and output channels.

    // Copy all bytes from any reader to any writer.
    use std::io::{self, Read, Write, ErrorKind};

    const DEFAULT_BUF_SIZE: usize = 8 * 1024;

    pub fn copy<R: ?Sized, W: ?Sized>(reader: &mut R, writer: &mut W)
        -> io::Result<u64>
        where R: Read, W: Write
    {
        let mut buf = [0; DEFAULT_BUF_SIZE];
        let mut written = 0;
        loop {
            let len = match reader.read(&mut buf) {
                Ok(0) => return Ok(written),   // end of input
                Ok(len) => len,
                Err(ref e) if e.kind() == ErrorKind::Interrupted => continue, // retry
                Err(e) => return Err(e),
            };
            writer.write_all(&buf[..len])?;
            written += len as u64;
        }
    }
    
    • std::io::copy() is generic: it can copy data from a File to a TcpStream, from Stdin to an in-memory Vec<u8>, and so on.
  • The four commonly used std::io traits, Read, BufRead, Write, and Seek, can be imported in two ways:

    • Import the dedicated prelude module, which contains all four:

      use std::io::prelude::*;

    • Import the std::io module itself:

      use std::io::{self, Read, Write, ErrorKind};
      // `self` declares io as an alias for the `std::io` module, so std::io::Result and std::io::Error can be written simply as io::Result and io::Error.
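
Because std::io::copy() is generic over Read and Write, the same call works for in-memory data as well as for files or sockets. The following is a minimal sketch of that idea; the helper names copy_all and copy_slice_to_vec are made up for illustration and are not part of the standard library.

```rust
use std::io::{self, Cursor, Read, Write};

/// Copies everything from `reader` to `writer` and reports the byte count.
/// `copy_all` is a hypothetical helper; it simply forwards to `std::io::copy`.
fn copy_all<R: Read, W: Write>(mut reader: R, mut writer: W) -> io::Result<u64> {
    io::copy(&mut reader, &mut writer)
}

/// Reads an in-memory byte slice through a `Cursor` and collects it in a `Vec<u8>`.
fn copy_slice_to_vec(input: &[u8]) -> io::Result<Vec<u8>> {
    let mut output = Vec::new();       // Vec<u8> implements Write
    let reader = Cursor::new(input);   // Cursor<&[u8]> implements Read
    copy_all(reader, &mut output)?;
    Ok(output)
}
```

Passing a File as the reader or a TcpStream as the writer should require no change to copy_all, since both types implement the relevant traits.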
      

16.1.1 - Reader

  • Common std::io::Read methods for reading data. Each takes the reader itself by mut reference:

    • reader.read(&mut buffer): reads some bytes from the data source and stores them in the given buffer.

      • The type of the buffer argument is &mut [u8].

      • This method reads at most buffer.len() bytes.

      • The return type is io::Result<usize>, an alias for Result<usize, io::Error>.

        On success, it returns the number of bytes read as a usize, which is <= buffer.len(). Ok(0) typically means the end of the input.

        On error, .read() returns Err(err), where err is an io::Error value. Its .kind() method returns an error code of type io::ErrorKind.

    • reader.read_to_end(&mut byte_vec): reads all remaining input from the reader and appends it to byte_vec.

      • byte_vec is a Vec<u8>.
      • This method returns io::Result<usize>, the number of bytes read.
      • There is no limit on how much data this method appends to the vector, so don't use it on an untrusted source. Use the .take() method to impose a limit.
    • reader.read_to_string(&mut string): reads all remaining input from the reader and appends it to string.

      • If the input is not valid UTF-8, this method returns an ErrorKind::InvalidData error.
      • Character sets other than UTF-8 are supported by the open source encoding crate.
    • reader.read_exact(&mut buf): reads exactly enough data from the reader to fill the given buffer.

      • The argument type is &mut [u8].
      • If the reader runs out of data before reading buf.len() bytes, this method returns an ErrorKind::UnexpectedEof error.
  • Common std::io::Read adapter methods. Each takes the reader by value and turns it into an iterator or a different reader:

    • reader.bytes(): returns an iterator over the bytes of the input stream.

      • The item type is io::Result<u8>, so every byte needs an error check.
      • This method calls reader.read() once per byte, which is inefficient for an unbuffered reader.
    • reader.chars(): treats the input as UTF-8 and returns an iterator over its characters. Invalid UTF-8 produces an InvalidData error. (This method was never stabilized and is not available in current stable Rust.)

    • reader1.chain(reader2): returns a new reader that yields all the input of reader1 followed by all the input of reader2.

    • reader.take(n): returns a new reader that reads from the same source as reader but is limited to n bytes of input.

  • Readers and writers typically implement Drop, so they are closed automatically when they are dropped.
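
The reader methods above can be combined as in the sketch below. It assumes a made-up input layout (a 4-byte header followed by a body), and the function names read_header_and_body and read_both, as well as the limit parameter, are hypothetical. Byte slices are used as readers, since &[u8] implements Read.

```rust
use std::io::{self, Read};

/// Reads a 4-byte header with `read_exact`, then at most `limit` bytes of body.
fn read_header_and_body<R: Read>(mut reader: R, limit: u64) -> io::Result<([u8; 4], Vec<u8>)> {
    let mut header = [0u8; 4];
    // Fails with ErrorKind::UnexpectedEof if fewer than 4 bytes remain.
    reader.read_exact(&mut header)?;

    let mut body = Vec::new();
    // take() caps how much read_to_end() is allowed to append.
    reader.take(limit).read_to_end(&mut body)?;
    Ok((header, body))
}

/// Concatenates two readers with `chain` and decodes the result as UTF-8.
fn read_both<A: Read, B: Read>(first: A, second: B) -> io::Result<String> {
    let mut text = String::new();
    first.chain(second).read_to_string(&mut text)?;
    Ok(text)
}
```

Wrapping an untrusted source in .take(limit) before calling read_to_end() is one way to apply the limit mentioned above; choosing a sensible limit is left to the caller.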

16.1.2 - Buffered readers

  • Buffering: the reader or writer is given a block of memory (the buffer) that temporarily holds input or output data. Buffering reduces the number of system calls.

  • Buffered readers implement both the Read and BufRead traits.

  • Common BufRead methods:

    • reader.read_line(&mut line): reads one line of text and appends it to line.

      • line is a String.
      • The line terminator, '\n' or "\r\n", is included in line.
      • The return value is io::Result<usize>, the number of bytes read, including the line terminator.
      • If the reader is at the end of the input, line is left unchanged and the method returns Ok(0).
    • reader.lines(): returns an iterator over the lines of the input.

      • The item type is io::Result<String>.
      • Line terminators are not included in the strings.
    • reader.read_until(stop_byte, &mut byte_vec) and reader.split(stop_byte): like .read_line() and .lines(), but byte-oriented, producing Vec<u8> values instead of Strings. stop_byte is the delimiter.

    • .fill_buf() and .consume(n): low-level methods for direct access to the reader's internal buffer.
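
To make the difference between read_line(), lines(), and split() concrete, here is a small sketch. The function names are hypothetical, and byte slices stand in for real buffered readers, since &[u8] implements BufRead.

```rust
use std::io::{self, BufRead};

/// Reads only the first line with `read_line`, keeping its line terminator.
fn first_line_raw<R: BufRead>(mut reader: R) -> io::Result<String> {
    let mut line = String::new();
    // At end of input this returns Ok(0) and leaves `line` empty.
    reader.read_line(&mut line)?;
    Ok(line)
}

/// Gathers every line with `lines()`, which strips "\n" and "\r\n".
fn all_lines<R: BufRead>(reader: R) -> io::Result<Vec<String>> {
    reader.lines().collect()
}

/// Splits the input on a delimiter byte with `split()`, yielding raw bytes.
fn split_fields<R: BufRead>(reader: R, delimiter: u8) -> io::Result<Vec<Vec<u8>>> {
    reader.split(delimiter).collect()
}
```

With the input "one\r\ntwo\n", first_line_raw keeps the "\r\n" on the first line, while all_lines returns "one" and "two" without terminators.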

16.1.3 - Reading lines of text

  • A look at the Unix grep command:

    • It searches many lines of text, typically piped in from another command, for a specified string.

    use std::io;
    use std::io::prelude::*;

    fn grep(target: &str) -> io::Result<()> {
        let stdin = io::stdin();
        for line_result in stdin.lock().lines() {
            let line = line_result?;
            if line.contains(target) {
                println!("{}", line);
            }
        }
        Ok(())
    }
    
    • Extending it to also search files on disk, by making the function generic:

      fn grep<R>(target: &str, reader: R) -> io::Result<()>
          where R: BufRead
      {
          for line_result in reader.lines() {
              let line = line_result?;
              if line.contains(target) {
                  println!("{}", line);
              }
          }
          Ok(())
      }

    • It can now be called with a StdinLock or a buffered File:

      let stdin = io::stdin();
      grep(&target, stdin.lock())?;

      let f = File::open(file)?;
      grep(&target, BufReader::new(f))?;
      
  • File and BufReader are two separate library features, because sometimes you want a file without buffering and sometimes you want buffering without a file.

    • A File is not buffered automatically, but BufReader::new(reader) creates a buffered reader around it.
    • To set the size of the buffer, use BufReader::with_capacity(size, reader).
  • The complete Unix grep program:

    // grep: search stdin or some files for lines matching a given string.
    use std::error::Error;
    use std::io::{self, BufReader};
    use std::io::prelude::*;
    use std::fs::File;
    use std::path::PathBuf;

    fn grep<R>(target: &str, reader: R) -> io::Result<()>
        where R: BufRead
    {
        for line_result in reader.lines() {
            let line = line_result?;
            if line.contains(target) {
                println!("{}", line);
            }
        }
        Ok(())
    }

    fn grep_main() -> Result<(), Box<dyn Error>> {
        // Get the command-line arguments. The first is the string to
        // search for; the rest are filenames.
        let mut args = std::env::args().skip(1);
        let target = match args.next() {
            Some(s) => s,
            None => Err("usage: grep PATTERN FILE...")?
        };
        let files: Vec<PathBuf> = args.map(PathBuf::from).collect();

        if files.is_empty() {
            let stdin = io::stdin();
            grep(&target, stdin.lock())?;
        } else {
            for file in files {
                let f = File::open(file)?;
                grep(&target, BufReader::new(f))?;
            }
        }
        Ok(())
    }

    fn main() {
        let result = grep_main();
        if let Err(err) = result {
            // Report the failure on stderr; ignore errors from the report itself.
            let _ = writeln!(io::stderr(), "{}", err);
        }
    }
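
Because grep is generic over BufRead, it can also be exercised without stdin or real files. The variant below is a sketch, not part of the original program: the name grep_to and the extra writer parameter are assumptions, added so that matches go to any Write value instead of stdout.

```rust
use std::io::{self, BufRead, Write};

/// Same matching logic as `grep`, but writes matching lines to `out`
/// and returns how many lines matched. `grep_to` is a hypothetical name.
fn grep_to<R: BufRead, W: Write>(target: &str, reader: R, mut out: W) -> io::Result<usize> {
    let mut matches = 0;
    for line_result in reader.lines() {
        let line = line_result?;
        if line.contains(target) {
            writeln!(out, "{}", line)?;
            matches += 1;
        }
    }
    Ok(matches)
}
```

Calling it with a byte slice as the reader and a &mut Vec<u8> as the writer keeps the whole search in memory, which is convenient in unit tests.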
    

16.1.4 - Collecting lines

  • Several reader methods, including .lines(), return iterators over Result values.
  • .collect() can gather the lines into a single Result:

    let lines = reader.lines().collect::<io::Result<Vec<String>>>()?;
    // io::Result<Vec<String>> is a collection type, so .collect() can create and fill in a value of that type.

  • The standard library implements the FromIterator trait for Result:

    impl<T, E, C> FromIterator<Result<T, E>> for Result<C, E>
        where C: FromIterator<T>
    {
        ...
    }

  • In other words: if items of type T can be collected into a collection of type C (where C: FromIterator<T>), then items of type Result<T, E> can be collected into a result of type Result<C, E> (FromIterator<Result<T, E>> for Result<C, E>).
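
The practical effect of this FromIterator impl is that the target type chosen for .collect() decides how errors are handled. The sketch below contrasts the two choices; the function names are hypothetical, and invalid UTF-8 is used as an easy way to provoke an error from .lines().

```rust
use std::io::{self, BufRead};

/// Collects all lines into one Result: the first error ends the collection
/// and becomes the return value.
fn collect_lines<R: BufRead>(reader: R) -> io::Result<Vec<String>> {
    reader.lines().collect::<io::Result<Vec<String>>>()
}

/// Collects every item as-is, keeping each line's own success or failure.
fn collect_results<R: BufRead>(reader: R) -> Vec<io::Result<String>> {
    reader.lines().collect()
}
```

The first form suits the common case where any read error should abort the operation; the second is only needed when each line's outcome has to be inspected separately.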

16.1.5 - Writer

  • To write to the standard output stream, use the println!() and print!() macros. They panic only if the write fails.

  • To write to any other writer, use the writeln!() and write!() macros.

    • They take an extra first argument, the writer.
    • They return a Result. When using them, it is a good idea to end the call with the ? operator so that errors are handled.
  • Methods of the Write trait:

    • writer.write(&buf): writes some of the bytes in the slice buf to the underlying stream. It returns io::Result<usize>; on success this is the number of bytes written, which may be less than buf.len(). This is a low-level method; avoid calling it directly.
    • writer.write_all(&buf): writes all the bytes in the slice buf. Returns Result<()>.
    • writer.flush(): flushes any buffered data to the underlying stream. Returns Result<()>.
  • Like readers, writers are closed automatically when they are dropped. When a buffered writer is dropped, any remaining buffered data is written to the underlying writer, but if an error occurs during that write, the error is ignored. To make sure the application notices all output errors, call .flush() manually on a buffered writer before dropping it.

  • BufWriter::new(writer) adds a buffer to any writer, just as BufReader::new(reader) adds a buffer to any reader.

    let file = File::create("tmp.txt")?;
    let writer = BufWriter::new(file);

  • To set the size of the writer's buffer, use BufWriter::with_capacity(size, writer).
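
The advice about flushing before a buffered writer is dropped might look like this in practice. This is a sketch under simple assumptions: write_numbered is a hypothetical helper, and it accepts any writer so that it can be tried with a Vec<u8> as well as a File.

```rust
use std::io::{self, BufWriter, Write};

/// Writes numbered lines through a BufWriter and flushes explicitly, so a
/// failed write is returned as an error instead of being ignored on drop.
fn write_numbered<W: Write>(writer: W, lines: &[&str]) -> io::Result<()> {
    let mut out = BufWriter::new(writer);
    for (i, line) in lines.iter().enumerate() {
        writeln!(out, "{}: {}", i + 1, line)?;
    }
    // Without this call, a write error during drop would go unnoticed.
    out.flush()
}
```

Passing File::create("tmp.txt")? as the writer would give the buffered file output described above.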

16.1.6 - Files

  • Ways to open a file:

    • File::open(filename): opens an existing file for reading. It returns an io::Result<File>, which is an error if the file does not exist.

    • File::create(filename): creates a new file for writing. If a file with the given name already exists, it is truncated.

    • OpenOptions: specifies exactly how the file should be opened.

      use std::fs::OpenOptions;

      let log = OpenOptions::new()
          .append(true)         // if the file exists, add to the end
          .open("server.log")?;

      let file = OpenOptions::new()
          .write(true)
          .create_new(true)     // fail if the file exists
          .open("new_file.txt")?;

      .append(), .write(), .create_new(), and so on can be chained because each returns self. In Rust this method-chaining pattern is called a builder.

  • The File type lives in the filesystem module, std::fs.

  • Once a File is open, it can be used like any other reader or writer. Add a buffer if needed. A File is also closed automatically when it is dropped.
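
As a sketch of how the OpenOptions builder could be wrapped for everyday use, the two helpers below append to a log file and create a file exclusively. The names append_line and create_exclusive are hypothetical, and adding .create(true) to the append case is an assumption made here so the helper also works when the file does not exist yet.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Appends one line to a file, creating the file if it is missing.
fn append_line(path: &Path, line: &str) -> io::Result<()> {
    let mut log = OpenOptions::new()
        .append(true)
        .create(true)
        .open(path)?;
    writeln!(log, "{}", line)
}

/// Creates a file only if it does not already exist.
fn create_exclusive(path: &Path, contents: &str) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create_new(true) // fails with ErrorKind::AlreadyExists if present
        .open(path)?;
    file.write_all(contents.as_bytes())
}
```

Checking err.kind() against io::ErrorKind::AlreadyExists lets a caller distinguish "someone else created it first" from other failures.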

16.1.7 - Seeking

File implements the Seek trait, which lets you jump around within a file instead of reading or writing in a single pass from beginning to end.

pub trait Seek {
    fn seek(&mut self, pos: SeekFrom) -> io::Result<u64>;
}

pub enum SeekFrom {
    Start(u64),
    End(i64),
    Current(i64)
}

  • file.seek(SeekFrom::Start(0)) jumps to the beginning of the file.
  • file.seek(SeekFrom::Current(-8)) moves back 8 bytes.
  • Seeking within a file is slow. Whether on a mechanical hard disk or a solid-state drive (SSD), one seek takes about as long as reading several megabytes of data.
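
The SeekFrom variants can be tried without touching the disk, because std::io::Cursor also implements Seek. The sketch below is written against any Read + Seek value, so it should work the same way with a File; the function names read_tail and read_twice are hypothetical.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Reads the last `n` bytes of any seekable reader.
fn read_tail<R: Read + Seek>(mut reader: R, n: usize) -> io::Result<Vec<u8>> {
    // Position the reader n bytes before the end of the data.
    reader.seek(SeekFrom::End(-(n as i64)))?;
    let mut tail = vec![0u8; n];
    reader.read_exact(&mut tail)?;
    Ok(tail)
}

/// Reads `n` bytes, steps back over them, and reads them again.
fn read_twice<R: Read + Seek>(mut reader: R, n: usize) -> io::Result<(Vec<u8>, Vec<u8>)> {
    let mut first = vec![0u8; n];
    reader.read_exact(&mut first)?;
    reader.seek(SeekFrom::Current(-(n as i64)))?;
    let mut second = vec![0u8; n];
    reader.read_exact(&mut second)?;
    Ok((first, second))
}
```

Seeking to a position before the start of the data is reported as an error rather than clamped, so read_tail fails when n exceeds the length of the input.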

16.1.8 - Other reader and writer types

  • io::stdin(): returns a reader for the standard input stream. Its type is io::Stdin.

    • It is shared by all threads, so every read acquires and releases a mutex.

    • Stdin's .lock() method acquires the mutex and returns an io::StdinLock, a buffered reader that holds the mutex until it is dropped. Reading through the lock avoids the per-read mutex overhead.

    • In the Rust versions the book targets, calling io::stdin().lock() in a single expression does not compile: the lock holds a reference to the Stdin value, so the Stdin must be stored somewhere that lives long enough. Bind it to a variable first, for example when collecting lines. (Since Rust 1.61, .lock() returns a StdinLock<'static>, so the one-line form is accepted as well.)

      let stdin = io::stdin();
      let lines = stdin.lock().lines();

  • io::stdout(): returns a writer for the standard output stream. It also has a mutex and a .lock() method.

  • io::stderr(): returns a writer for the standard error stream. It also has a mutex and a .lock() method.

  • Vec<u8> implements Write.

    • Writing to a Vec<u8> extends the vector with the new data.
    • String does not implement Write. To build a string with Write, first write into a Vec<u8>, then convert the vector to a string with String::from_utf8(vec).
  • Cursor::new(buf): creates a Cursor, a buffered reader that reads from buf.

    • This is how to create a reader that reads from a String.
    • The argument buf can be any type that implements AsRef<[u8]>, so you can also pass a &[u8], &str, or Vec<u8>.
    • Internally, a Cursor holds only buf itself and an integer, the current offset into buf, which starts at 0.
    • Cursor implements the Read, BufRead, and Seek traits.
    • If buf is a &mut [u8] or a Vec<u8>, the Cursor also implements Write. Cursor<&mut [u8]> and Cursor<Vec<u8>> therefore implement all four of the std::io::prelude traits.
  • std::net::TcpStream: represents a TCP network connection.

    • It is both a reader and a writer, because TCP supports two-way communication.
    • The static method TcpStream::connect(("hostname", PORT)) tries to connect to a server and returns an io::Result<TcpStream>.
  • std::process::Command: supports spawning a child process and piping data to its standard input.

    use std::process::{Command, Stdio};

    let mut child = Command::new("grep")
        .arg("-e")
        .arg("a.*e.*i.*o.*u")
        .stdin(Stdio::piped())
        .spawn()?;

    let mut to_child = child.stdin.take().unwrap();

    for word in my_words {
        writeln!(to_child, "{}", word)?;
    }
    drop(to_child); // close grep's stdin so it will exit
    child.wait()?;

    • The type of child.stdin is Option<std::process::ChildStdin>.
    • Command also has .stdout() and .stderr() methods.
  • The std::io module also provides a few functions that return trivial readers and writers:

    • io::sink(): the no-op writer. All write methods return Ok, but the data is discarded.
    • io::empty(): the no-op reader. Reading always succeeds but reports end of input.
    • io::repeat(byte): returns a reader that repeats the given byte endlessly.
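
A few of the in-memory readers and writers from this section can be combined as in the following sketch. The helper names build_greeting, padding, and discard are hypothetical; the standard library calls they use (write!, String::from_utf8, io::repeat, io::sink, io::copy) are the ones described above.

```rust
use std::io::{self, Read, Write};

/// Builds a String by writing into a Vec<u8> and converting it afterwards,
/// since String itself does not implement Write.
fn build_greeting(name: &str) -> String {
    let mut bytes: Vec<u8> = Vec::new();
    write!(bytes, "Hello, {}!", name).expect("writing to a Vec<u8> does not fail");
    String::from_utf8(bytes).expect("write! only emitted valid UTF-8")
}

/// Produces `n` copies of `byte` by capping the endless `io::repeat` reader.
fn padding(byte: u8, n: u64) -> io::Result<Vec<u8>> {
    let mut out = Vec::new();
    io::repeat(byte).take(n).read_to_end(&mut out)?;
    Ok(out)
}

/// Copies a reader into `io::sink()`, returning how many bytes were discarded.
fn discard<R: Read>(mut reader: R) -> io::Result<u64> {
    io::copy(&mut reader, &mut io::sink())
}
```

Note that io::repeat never reports end of input on its own, so it is paired with .take(n) here; calling read_to_end() on it directly would not terminate.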

16.1.9 - Binary data, compression, and serialization: open source crates that extend std::io

  • The byteorder crate: provides the ReadBytesExt and WriteBytesExt traits, which add methods for binary input and output to all readers and writers.

  • The flate2 crate: provides adapter methods for reading and writing gzip-compressed data.

  • The serde crate: handles serialization and deserialization, converting between Rust data structures and bytes.

    • The serialize method of the serde::Serialize trait is available on every type that supports serialization, such as strings, characters, tuples, vectors, and HashMaps.

    • serde also supports deriving these traits for custom types:

      #[derive(Serialize, Deserialize)]
      struct Player {
          location: String,
          items: Vec<String>,
          health: u32
      }
      

See 《Rust Programming 》( Jim - Brandy 、 Jason, - By orendov , Translated by lisongfeng ) Chapter 18
Copyright notice
This article was written by [phial03]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/186/202207050549561343.html