IMDB Sentiment Classification Practice (SimpleRNN, LSTM, GRU)
2022-07-01 07:43:00 【Programming bear】
This post uses the classic IMDB movie review dataset to complete a sentiment classification task. The IMDB dataset contains 50,000 user reviews, each labeled as negative or positive: reviews with an IMDB rating < 5 are labeled 0 (negative), and reviews with a rating >= 7 are labeled 1 (positive). 25,000 reviews form the training set and the other 25,000 form the test set.
1. Dataset Loading and Preprocessing
# Load the IMDB dataset; the data is numerically encoded, one integer per word
# (total_words, max_review_len and batches are defined in the full program in Section 7)
(x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=total_words)
print(x_train.shape, len(x_train[0]), y_train.shape)  # (25000,) 218 (25000,)
print(x_test.shape, len(x_test[0]), y_test.shape)     # (25000,) 68 (25000,)
# Truncate and pad the sentences to a uniform length max_review_len:
# long sentences keep their tail, short sentences are padded in front
x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_len)
x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_len)
# Build the datasets: shuffle, batch, and drop the last batch that cannot fill a whole batch
db_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))
db_train = db_train.shuffle(1000).batch(batches, drop_remainder=True)
db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))
db_test = db_test.batch(batches, drop_remainder=True)
print('db_train:', db_train)  # db_train: <BatchDataset shapes: ((128, 80), (128,)), types: (tf.int32, tf.int64)>

The IMDB dataset can be downloaded and loaded through keras.datasets. It consists of 50,000 user reviews, half for training and half for testing. x_train and x_test are one-dimensional arrays of length 25,000; each element is a variable-length list holding one numerically encoded sentence. For example, the first sentence of the training set has 218 words and the first sentence of the test set has 68 words. Each encoded sentence also carries a start-of-sentence ID at its beginning.
Since the sentences have uneven lengths, a length threshold is set. Sentences longer than the threshold are truncated, either at the beginning or at the end of the sentence; sentences shorter than it are padded, again either at the beginning or at the end. Both truncation and padding are handled by the keras.preprocessing.sequence.pad_sequences() function (by default it truncates and pads at the front).
After all sentences are truncated or padded to the same length, they are wrapped into Dataset objects and passed through the usual dataset pipeline: shuffling, batching, and dropping the last batch when it cannot fill a whole batch (drop_remainder=True).
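To make the default truncation and padding direction concrete, here is a minimal standalone sketch (the toy sequences are made up for illustration): pad_sequences keeps the tail of an over-long sequence and zero-pads a short one in front.

from tensorflow import keras

toy = [[1, 2, 3, 4, 5], [6, 7]]
# Default truncating='pre' keeps the last maxlen tokens; padding='pre' pads in front
print(keras.preprocessing.sequence.pad_sequences(toy, maxlen=3))
# [[3 4 5]
#  [0 6 7]]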
2. Building the Network Model
Custom network model class MyRNN: Embedding layer --> two RNN layers --> classification network
class MyRNN(keras.Model):
    def __init__(self, units):
        super(MyRNN, self).__init__()
        # Word embedding: [b, 80] ==> [b, 80, 100]
        # embedding_len: word vector length; total_words: vocabulary size;
        # max_review_len: input sentence length
        self.embedding = layers.Embedding(total_words, embedding_len, input_length=max_review_len)
        # Two stacked RNN layers: [b, 80, 100] => [b, 64]
        self.rnn = Sequential([
            layers.SimpleRNN(units, dropout=0.5, return_sequences=True),
            layers.SimpleRNN(units, dropout=0.5)
        ])
        # Classification network for the RNN output features, binary classification
        # [b, 64] => [b, 1]
        self.outlayer = Sequential([
            layers.Dense(units),
            layers.Dropout(rate=0.5),
            layers.ReLU(),
            layers.Dense(1)
        ])

    # Forward pass
    def call(self, inputs, training=None):
        x = inputs  # [b, 80]
        # embedding: [b, 80] ==> [b, 80, 100]
        x = self.embedding(x)
        # rnn compute: [b, 80, 100] => [b, 64]
        out1 = self.rnn(x, training=training)
        # Last output of the final layer feeds the classification network: [b, 64] => [b, 1]
        x = self.outlayer(out1, training=training)
        # p(y is pos | x)
        prob = tf.sigmoid(x)
        return prob
Embedding layer: encodes each word into a word vector. It accepts numerically encoded word indices, and its weights are trainable.
RNN layers: each layer needs the previous layer's state output at every timestamp, so every RNN layer except the last must return its output at each timestamp, which is done by setting return_sequences=True; that output serves as the next layer's input. The dropout argument is a regularizer: it randomly drops part of the connections during training.
The classification network performs a binary classification task, so its output is a single node. The input sequence is encoded into word vectors by the Embedding layer and passed through the two RNN layers to extract semantic features; the state output of the last timestamp of the last layer is fed to the classification network, and a Sigmoid activation then yields the output probability.
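As a quick sanity check of the return_sequences setting (a standalone sketch with a made-up batch size of 2, not part of the original program), the shapes work out as follows:

import tensorflow as tf
from tensorflow.keras import layers

x = tf.random.normal([2, 80, 100])  # a fake embedded batch: [b, 80, 100]
# With return_sequences=True the layer keeps its output at every timestamp,
# so the next RNN layer can consume it; without it, only the last state remains.
print(layers.SimpleRNN(64, return_sequences=True)(x).shape)  # (2, 80, 64)
print(layers.SimpleRNN(64)(x).shape)                         # (2, 64)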
Using the Cell-based approach:
class MyRNN(keras.Model):
    def __init__(self, units):
        super(MyRNN, self).__init__()
        # [b, 64]: initial state vectors for the two Cells, reused on every call
        self.state0 = [tf.zeros([batches, units])]
        self.state1 = [tf.zeros([batches, units])]
        # Word embedding: [b, 80] ==> [b, 80, 100]
        self.embedding = layers.Embedding(total_words, embedding_len, input_length=max_review_len)
        # Build 2 Cells
        self.rnn_cell0 = layers.SimpleRNNCell(units, dropout=0.5)  # dropout for regularization
        self.rnn_cell1 = layers.SimpleRNNCell(units, dropout=0.5)
        # Classification network for the Cell output features, binary classification
        # [b, 64] => [b, 1]
        self.outlayer = Sequential([
            layers.Dense(units),
            layers.Dropout(rate=0.5),
            layers.ReLU(),
            layers.Dense(1)
        ])

    # Forward pass
    def call(self, inputs, training=None):
        x = inputs  # [b, 80]
        # embedding: [b, 80] ==> [b, 80, 100]
        x = self.embedding(x)
        # rnn cell compute: [b, 80, 100] => [b, 64]
        state0 = self.state0
        state1 = self.state1
        for word in tf.unstack(x, axis=1):  # word: [b, 100], unrolled along the time axis
            out0, state0 = self.rnn_cell0(word, state0, training=training)
            out1, state1 = self.rnn_cell1(out0, state1, training=training)
        # Last output of the final layer feeds the classification network: [b, 64] => [b, 1]
        x = self.outlayer(out1, training=training)
        # p(y is pos | x)
        prob = tf.sigmoid(x)
        return prob

With the Cell-based approach you implement the forward pass yourself and maintain each RNN layer's initial state vector; everything else is the same as the layer-based version.
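The tf.unstack call drives the manual unrolling. A tiny standalone check (shapes chosen to match the comments above) shows what it produces:

import tensorflow as tf

x = tf.zeros([128, 80, 100])        # [b, s, d], as in the model above
steps = tf.unstack(x, axis=1)       # one tensor per timestep
print(len(steps), steps[0].shape)   # 80 (128, 100)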
3. Model Assembly
model = MyRNN(units)
# Assemble the optimizer, learning rate, loss, and metrics
model.compile(optimizer=optimizers.Adam(1e-3),
              loss=losses.BinaryCrossentropy(),
              metrics=['accuracy'])

After building the network model, you specify the optimizer object, the loss function type, the evaluation metrics, and so on; this step is called assembly. For simplicity, the network is trained with Keras's Compile&Fit workflow, with the optimizer set to Adam.
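One equivalent variant (my own suggestion, not what the original program does): drop the tf.sigmoid in call(), return the raw logits, and let the loss apply the sigmoid internally, which is slightly more numerically stable:

# Assumes call() now returns x (the logits) instead of tf.sigmoid(x)
model.compile(optimizer=optimizers.Adam(1e-3),
              loss=losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])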
4. Training and Validation
# Train and validate; validation_data: the validation set
model.fit(db_train, epochs=epochs, validation_data=db_test)
# Test
model.evaluate(db_test)

Keras's Compile&Fit workflow is used to train the network: compile() sets the optimizer, learning rate, loss function, and evaluation metric (accuracy here), and fit() feeds in the training and validation datasets.
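An optional addition (not in the original post): an early-stopping callback, so training halts once validation accuracy stops improving instead of always running the full epoch budget.

# Hypothetical tweak: stop when val_accuracy plateaus and keep the best weights
early_stop = keras.callbacks.EarlyStopping(monitor='val_accuracy',
                                           patience=3,
                                           restore_best_weights=True)
model.fit(db_train, epochs=epochs, validation_data=db_test,
          callbacks=[early_stop])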
Result: the model was trained for nearly 30 epochs (training log figure from the original post not reproduced here).

5. LSTM Model
Only the network model needs to change: swap the recurrent layer type.
# Build the rnn with LSTM layers
self.rnn = Sequential([
    layers.LSTM(units, dropout=0.5, return_sequences=True),
    layers.LSTM(units, dropout=0.5)
])

Result: LSTM is clearly a bit better than SimpleRNN.

6. GRU Model
Again, only the recurrent layer type needs to change.
# Build the rnn with GRU layers
self.rnn = Sequential([
    layers.GRU(units, dropout=0.5, return_sequences=True),
    layers.GRU(units, dropout=0.5)
])

Result: also slightly better than SimpleRNN.

7. Complete Program
# -*- coding: utf-8 -*-
# @Time : 10:20
# @Author:Paranipd
# @File : imdb_rnn_cell.py
# @Software:PyCharm

import os
import tensorflow as tf
import numpy as np
from tensorflow import keras
from tensorflow.keras import Sequential, Model, layers, metrics, optimizers, losses

tf.random.set_seed(22)
np.random.seed(22)
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
assert tf.__version__.startswith('2')

batches = 128          # batch size
total_words = 10000    # vocabulary size N_vocab
max_review_len = 80    # maximum sentence length s; longer parts are truncated, shorter ones padded
embedding_len = 100    # word vector feature length

# Load the IMDB dataset; the data is numerically encoded, one integer per word
(x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=total_words)
print(x_train.shape, len(x_train[0]), y_train.shape)  # (25000,) 218 (25000,)
print(x_test.shape, len(x_test[0]), y_test.shape)     # (25000,) 68 (25000,)

# Truncate and pad the sentences to a uniform length max_review_len:
# long sentences keep their tail, short sentences are padded in front
x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_len)
x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_len)

# Build the datasets: shuffle, batch, and drop the last batch that cannot fill a whole batch
db_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))
db_train = db_train.shuffle(1000).batch(batches, drop_remainder=True)
db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))
db_test = db_test.batch(batches, drop_remainder=True)
print('db_train:', db_train)  # db_train: <BatchDataset shapes: ((128, 80), (128,)), types: (tf.int32, tf.int64)>


class MyRNN(keras.Model):
    def __init__(self, units):
        super(MyRNN, self).__init__()
        # Word embedding: [b, 80] ==> [b, 80, 100]
        # embedding_len: word vector length; total_words: vocabulary size;
        # max_review_len: input sentence length
        self.embedding = layers.Embedding(total_words, embedding_len, input_length=max_review_len)
        # Two stacked RNN layers: [b, 80, 100] => [b, 64]
        self.rnn = Sequential([
            layers.SimpleRNN(units, dropout=0.5, return_sequences=True),
            layers.SimpleRNN(units, dropout=0.5)
        ])
        # Classification network for the RNN output features, binary classification
        # [b, 64] => [b, 1]
        self.outlayer = Sequential([
            layers.Dense(units),
            layers.Dropout(rate=0.5),
            layers.ReLU(),
            layers.Dense(1)
        ])

    # Forward pass
    def call(self, inputs, training=None):
        x = inputs  # [b, 80]
        # embedding: [b, 80] ==> [b, 80, 100]
        x = self.embedding(x)
        # rnn compute: [b, 80, 100] => [b, 64]
        out1 = self.rnn(x, training=training)
        # Last output of the final layer feeds the classification network: [b, 64] => [b, 1]
        x = self.outlayer(out1, training=training)
        # p(y is pos | x)
        prob = tf.sigmoid(x)
        return prob


def main():
    units = 64   # rnn state vector length
    epochs = 50
    model = MyRNN(units)
    # Assemble the optimizer, learning rate, loss, and metrics
    model.compile(optimizer=optimizers.Adam(1e-3),
                  loss=losses.BinaryCrossentropy(),
                  metrics=['accuracy'])
    # Train and validate; validation_data: the validation set
    model.fit(db_train, epochs=epochs, validation_data=db_test)
    # Test
    model.evaluate(db_test)


if __name__ == '__main__':
    main()