IMDB Sentiment Classification Practice (SimpleRNN, LSTM, GRU)
2022-07-01 07:43:00 【Programming bear】
This exercise uses the classic IMDB movie-review dataset to complete a sentiment-classification task. The dataset contains 50,000 user reviews labeled as negative or positive: reviews with an IMDB rating < 5 are labeled 0 (negative), and reviews with a rating >= 7 are labeled 1 (positive). 25,000 reviews form the training set and 25,000 the test set.
1. Dataset Loading and Preprocessing
# Load the IMDB dataset; the data is numerically encoded, one integer per word
(x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=total_words)
print(x_train.shape, len(x_train[0]), y_train.shape) # (25000,) 218 (25000,)
print(x_test.shape, len(x_test[0]), y_test.shape) # (25000,) 68 (25000,)
# Truncate and pad sentences to length max_review_len: keep the tail of long sentences, pad short sentences at the front
x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_len)
x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_len)
# Build the dataset: shuffle, batch, and drop the last batch when it does not fill a whole batch
db_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))
db_train = db_train.shuffle(1000).batch(batches, drop_remainder=True)
db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))
db_test = db_test.batch(batches, drop_remainder=True)
print('db_train:', db_train)  # db_train: <BatchDataset shapes: ((128, 80), (128,)), types: (tf.int32, tf.int64)>
Keras's datasets module downloads the IMDB dataset over the network. The dataset consists of 50,000 user reviews, half for training and half for testing. x_train and x_test are one-dimensional arrays of length 25,000; each element is a variable-length list that stores one sentence as numeric codes. For example, the first training sentence has 218 words and the first test sentence has 68. Each sentence begins with a start-of-sequence ID.
Because sentences have unequal lengths, a length threshold is set. Sentences longer than the threshold are truncated, either at the beginning or at the end of the sentence; sentences shorter than the threshold are padded, also at either the beginning or the end. Both truncation and padding are performed by the keras.preprocessing.sequence.pad_sequences() function.
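The default pad_sequences behavior (truncating='pre', padding='pre') can be sketched in plain Python. The helper below is illustrative only, not the Keras implementation:

```python
# Minimal sketch of pad_sequences' defaults: a long sequence keeps its
# LAST maxlen tokens; a short sequence is zero-padded at the FRONT.
def pad_sequence(seq, maxlen, value=0):
    if len(seq) >= maxlen:
        return seq[-maxlen:]                        # keep the tail of a long sentence
    return [value] * (maxlen - len(seq)) + seq      # pad the short sentence in front

print(pad_sequence([1, 2, 3, 4, 5], maxlen=3))  # [3, 4, 5]
print(pad_sequence([7, 8], maxlen=4))           # [0, 0, 7, 8]
```

Keras also accepts truncating='post' / padding='post' to cut or pad at the end instead.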
Once truncated or padded to the same length, the data is wrapped into Dataset objects and passed through the usual dataset-processing steps: shuffling, batching, and discarding the final batch when it does not fill a whole batch.
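With drop_remainder=True, the number of batches per epoch is simply the floor of dataset size over batch size; a quick check with the sizes used here:

```python
# 25,000 training reviews split into batches of 128, dropping the remainder
dataset_size = 25000
batch_size = 128
num_batches = dataset_size // batch_size                # full batches kept
remainder = dataset_size - num_batches * batch_size     # samples dropped each epoch
print(num_batches, remainder)  # 195 40
```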
2. Building the Network Model
Custom model class MyRNN: Embedding layer --> two RNN layers --> classification layer.
class MyRNN(keras.Model):
    def __init__(self, units):
        super(MyRNN, self).__init__()
        # Word-vector encoding: [b, 80] ==> [b, 80, 100]
        # embedding_len: word-vector length; total_words: vocabulary size; max_review_len: input sentence length
        self.embedding = layers.Embedding(total_words, embedding_len, input_length=max_review_len)
        # Two RNN layers: [b, 80, 100] => [b, 64]
        self.rnn = Sequential([
            layers.SimpleRNN(units, dropout=0.5, return_sequences=True),
            layers.SimpleRNN(units, dropout=0.5)
        ])
        # Classification head applied to the final RNN feature; binary classification
        # [b, 64] => [b, 1]
        self.outlayer = Sequential([
            layers.Dense(units),
            layers.Dropout(rate=0.5),
            layers.ReLU(),
            layers.Dense(1)
        ])

    # Forward pass
    def call(self, inputs, training=None):
        x = inputs  # [b, 80]
        # embedding: [b, 80] ==> [b, 80, 100]
        x = self.embedding(x)
        # rnn compute: [b, 80, 100] => [b, 64]
        out1 = self.rnn(x, training=training)
        # The last layer's final output feeds the classifier: [b, 64] => [b, 1]
        x = self.outlayer(out1, training=training)
        # p(y is pos | x)
        prob = tf.sigmoid(x)
        return prob
Embedding layer: encodes each word into a word vector. It accepts numerically encoded word IDs as input, and its lookup table is trainable.
RNN layers: each layer consumes the state output of the previous layer at every timestep. Therefore every RNN layer except the last must return its state output at each timestep, enabled by setting return_sequences=True, so it can serve as the next layer's input. The dropout argument randomly drops connections between layers to reduce overfitting.
The classification network performs a binary classification task, so its output is a single node. The input sequence is encoded into word vectors by the Embedding layer, passed through the two RNN layers to extract semantic features, and the state output at the last timestep of the last layer is fed to the classification network; a sigmoid activation then produces the output probability.
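As a mental model (not the actual Keras implementation), the Embedding layer is a trainable lookup table: word ID i selects row i of a [total_words, embedding_len] weight matrix. A toy-sized sketch with made-up weights:

```python
# Toy sizes for illustration; real code uses total_words=10000, embedding_len=100
total_words, embedding_len = 5, 3
# Stand-in for the trained embedding matrix [total_words, embedding_len]
table = [[float(i) + j / 10 for j in range(embedding_len)]
         for i in range(total_words)]

sentence = [2, 0, 4]                          # numerically encoded words, shape [3]
vectors = [table[word] for word in sentence]  # shape [3, embedding_len]
print(vectors[0])  # row 2 of the table
```

Training adjusts the rows of the table by backpropagation, just like any other weight.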
The same model built from Cells:
class MyRNN(keras.Model):
    def __init__(self, units):
        super(MyRNN, self).__init__()
        # [b, 64]: build each Cell's initial state vector, reused across batches
        self.state0 = [tf.zeros([batches, units])]
        self.state1 = [tf.zeros([batches, units])]
        # Word-vector encoding: [b, 80] ==> [b, 80, 100]
        self.embedding = layers.Embedding(total_words, embedding_len, input_length=max_review_len)
        # Build 2 Cells
        self.rnn_cell0 = layers.SimpleRNNCell(units, dropout=0.5)  # dropout randomly drops connections
        self.rnn_cell1 = layers.SimpleRNNCell(units, dropout=0.5)
        # Classification head applied to the final Cell output; binary classification
        # [b, 80, 100] => [b, 64] => [b, 1]
        self.outlayer = Sequential([
            layers.Dense(units),
            layers.Dropout(rate=0.5),
            layers.ReLU(),
            layers.Dense(1)
        ])

    # Forward pass
    def call(self, inputs, training=None):
        x = inputs  # [b, 80]
        # embedding: [b, 80] ==> [b, 80, 100]
        x = self.embedding(x)
        # rnn cell compute: [b, 80, 100] => [b, 64]
        state0 = self.state0
        state1 = self.state1
        for word in tf.unstack(x, axis=1):  # word: [b, 100], unrolled along the time axis
            out0, state0 = self.rnn_cell0(word, state0, training=training)
            out1, state1 = self.rnn_cell1(out0, state1, training=training)
        # The last layer's final output feeds the classifier: [b, 64] => [b, 1]
        x = self.outlayer(out1, training=training)
        # p(y is pos | x)
        prob = tf.sigmoid(x)
        return prob
With the Cell API you must implement the forward unrolling yourself and maintain each RNN layer's initial state vector; everything else is identical to the layer-based version.
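For reference, the per-timestep loop computes, at each step of each cell, h_t = tanh(x_t·W_x + h_{t-1}·W_h + b). A plain-Python sketch with toy weights (illustrative only, not the Keras kernel):

```python
import math

# One SimpleRNN cell step: new state = tanh(x . Wx + h . Wh + b).
# All weight values below are arbitrary toy numbers.
def rnn_cell_step(x, h, Wx, Wh, b):
    units = len(h)
    return [math.tanh(sum(x[i] * Wx[i][u] for i in range(len(x)))
                      + sum(h[j] * Wh[j][u] for j in range(units))
                      + b[u])
            for u in range(units)]

x = [1.0, 0.5]                      # one word vector (input dim 2)
h = [0.0, 0.0]                      # initial state (units = 2)
Wx = [[0.1, 0.2], [0.3, 0.4]]       # input-to-state weights
Wh = [[0.0, 0.1], [0.1, 0.0]]       # state-to-state weights
b = [0.0, 0.0]
h = rnn_cell_step(x, h, Wx, Wh, b)  # new state, fed back at the next timestep
print(h)
```

The state h is both the cell's output and the input state for the next timestep, which is exactly why the call loop threads state0/state1 through the sequence.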
3. Model Assembly
model = MyRNN(units)
# Assemble the optimizer, learning rate, loss function, and metrics
model.compile(optimizer=optimizers.Adam(1e-3),
              loss=losses.BinaryCrossentropy(),
              metrics=['accuracy'])
After building the model, you specify the optimizer, the loss function, and the evaluation metrics; this step is called assembly. For simplicity, the network is trained with Keras's Compile & Fit workflow, using the Adam optimizer with a learning rate of 1e-3.
4. Training and Validation
# Train and validate; validation_data: the validation set
model.fit(db_train, epochs=epochs, validation_data=db_test)
# Test
model.evaluate(db_test)
With the Compile & Fit workflow, fit() feeds the training set to the network and evaluates on validation_data (here the test set) after each epoch, reporting the accuracy metric; evaluate() then measures the final performance on the test set.
Result: after training for nearly 30 epochs:

5. LSTM Model
Only the network model needs to change: swap the recurrent layer type.
# Build the rnn
self.rnn = Sequential([
    layers.LSTM(units, dropout=0.5, return_sequences=True),
    layers.LSTM(units, dropout=0.5)
])
Result: LSTM clearly performs somewhat better than SimpleRNN.

6. GRU Model
Only the network model needs to change: swap the recurrent layer type.
# Build the rnn
self.rnn = Sequential([
    layers.GRU(units, dropout=0.5, return_sequences=True),
    layers.GRU(units, dropout=0.5)
])
Result: GRU is also slightly better than SimpleRNN.

7. Complete Program
# -*- coding: utf-8 -*-
# @Time : 10:20
# @Author: Paranipd
# @File : imdb_rnn_cell.py
# @Software: PyCharm
import os

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import Sequential, layers, optimizers, losses

tf.random.set_seed(22)
np.random.seed(22)
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
assert tf.__version__.startswith('2')

batches = 128        # batch size
total_words = 10000  # vocabulary size N_vocab
max_review_len = 80  # maximum sentence length s: longer sentences are truncated, shorter ones padded
embedding_len = 100  # word-vector (embedding) length

# Load the IMDB dataset; the data is numerically encoded, one integer per word
(x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=total_words)
print(x_train.shape, len(x_train[0]), y_train.shape)  # (25000,) 218 (25000,)
print(x_test.shape, len(x_test[0]), y_test.shape)     # (25000,) 68 (25000,)

# Truncate and pad sentences to length max_review_len: keep the tail of long
# sentences, pad short sentences at the front
x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_len)
x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_len)

# Build the datasets: shuffle, batch, and drop the last incomplete batch
db_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))
db_train = db_train.shuffle(1000).batch(batches, drop_remainder=True)
db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))
db_test = db_test.batch(batches, drop_remainder=True)
print('db_train:', db_train)  # db_train: <BatchDataset shapes: ((128, 80), (128,)), types: (tf.int32, tf.int64)>


class MyRNN(keras.Model):
    def __init__(self, units):
        super(MyRNN, self).__init__()
        # Word-vector encoding: [b, 80] ==> [b, 80, 100]
        # embedding_len: word-vector length; total_words: vocabulary size;
        # max_review_len: input sentence length
        self.embedding = layers.Embedding(total_words, embedding_len,
                                          input_length=max_review_len)
        # Two RNN layers: [b, 80, 100] => [b, 64]
        self.rnn = Sequential([
            layers.SimpleRNN(units, dropout=0.5, return_sequences=True),
            layers.SimpleRNN(units, dropout=0.5)
        ])
        # Classification head applied to the final RNN feature; binary classification
        # [b, 64] => [b, 1]
        self.outlayer = Sequential([
            layers.Dense(units),
            layers.Dropout(rate=0.5),
            layers.ReLU(),
            layers.Dense(1)
        ])

    # Forward pass
    def call(self, inputs, training=None):
        x = inputs  # [b, 80]
        # embedding: [b, 80] ==> [b, 80, 100]
        x = self.embedding(x)
        # rnn compute: [b, 80, 100] => [b, 64]
        out1 = self.rnn(x, training=training)
        # The last layer's final output feeds the classifier: [b, 64] => [b, 1]
        x = self.outlayer(out1, training=training)
        # p(y is pos | x)
        prob = tf.sigmoid(x)
        return prob


def main():
    units = 64   # RNN state-vector length
    epochs = 50
    model = MyRNN(units)
    # Assemble the optimizer, learning rate, loss, and metrics
    model.compile(optimizer=optimizers.Adam(1e-3),
                  loss=losses.BinaryCrossentropy(),
                  metrics=['accuracy'])
    # Train and validate; validation_data: the validation set
    model.fit(db_train, epochs=epochs, validation_data=db_test)
    # Test
    model.evaluate(db_test)


if __name__ == '__main__':
    main()