IMDB Sentiment Classification Practice (SimpleRNN, LSTM, GRU)
2022-07-01 07:43:00 【Programming bear】
This post uses the classic IMDB movie review dataset to complete a sentiment classification task. The IMDB dataset contains 50,000 user reviews, each labeled as negative or positive: reviews with an IMDB rating < 5 are labeled 0 (negative), and reviews with a rating >= 7 are labeled 1 (positive). 25,000 reviews form the training set and the other 25,000 form the test set.
1. Dataset Loading and Preprocessing
# Load the IMDB dataset; the data is numerically encoded, one integer per word
# (total_words, max_review_len and batches are defined in the full program in Section 7)
(x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=total_words)
print(x_train.shape, len(x_train[0]), y_train.shape)  # (25000,) 218 (25000,)
print(x_test.shape, len(x_test[0]), y_test.shape)     # (25000,) 68 (25000,)
# Truncate and pad the sentences to a uniform length max_review_len:
# long sentences keep their tail, short sentences are padded in front
x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_len)
x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_len)
# Build the datasets: shuffle, batch, and drop the last batch that cannot fill a whole batch
db_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))
db_train = db_train.shuffle(1000).batch(batches, drop_remainder=True)
db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))
db_test = db_test.batch(batches, drop_remainder=True)
print('db_train:', db_train)  # db_train: <BatchDataset shapes: ((128, 80), (128,)), types: (tf.int32, tf.int64)>

The IMDB dataset can be downloaded and loaded through keras.datasets. It consists of 50,000 user reviews, half for training and half for testing. x_train and x_test are one-dimensional arrays of length 25,000; each element is a variable-length list holding one numerically encoded sentence. For example, the first sentence of the training set has 218 words and the first sentence of the test set has 68 words. Each encoded sentence also carries a start-of-sentence ID at its beginning.
Since the sentences have uneven lengths, a length threshold is set. Sentences longer than the threshold are truncated, either at the beginning or at the end of the sentence; sentences shorter than it are padded, again either at the beginning or at the end. Both truncation and padding are handled by the keras.preprocessing.sequence.pad_sequences() function (by default it truncates and pads at the front).
After all sentences are truncated or padded to the same length, they are wrapped into Dataset objects and passed through the usual dataset pipeline: shuffling, batching, and dropping the last batch when it cannot fill a whole batch (drop_remainder=True).
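To make the default truncation and padding direction concrete, here is a minimal standalone sketch (the toy sequences are made up for illustration): pad_sequences keeps the tail of an over-long sequence and zero-pads a short one in front.

from tensorflow import keras

toy = [[1, 2, 3, 4, 5], [6, 7]]
# Default truncating='pre' keeps the last maxlen tokens; padding='pre' pads in front
print(keras.preprocessing.sequence.pad_sequences(toy, maxlen=3))
# [[3 4 5]
#  [0 6 7]]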
2. Building the Network Model
Custom network model class MyRNN: Embedding layer --> two RNN layers --> classification network
class MyRNN(keras.Model):
    def __init__(self, units):
        super(MyRNN, self).__init__()
        # Word embedding: [b, 80] ==> [b, 80, 100]
        # embedding_len: word vector length; total_words: vocabulary size;
        # max_review_len: input sentence length
        self.embedding = layers.Embedding(total_words, embedding_len, input_length=max_review_len)
        # Two stacked RNN layers: [b, 80, 100] => [b, 64]
        self.rnn = Sequential([
            layers.SimpleRNN(units, dropout=0.5, return_sequences=True),
            layers.SimpleRNN(units, dropout=0.5)
        ])
        # Classification network for the RNN output features, binary classification
        # [b, 64] => [b, 1]
        self.outlayer = Sequential([
            layers.Dense(units),
            layers.Dropout(rate=0.5),
            layers.ReLU(),
            layers.Dense(1)
        ])

    # Forward pass
    def call(self, inputs, training=None):
        x = inputs  # [b, 80]
        # embedding: [b, 80] ==> [b, 80, 100]
        x = self.embedding(x)
        # rnn compute: [b, 80, 100] => [b, 64]
        out1 = self.rnn(x, training=training)
        # Last output of the final layer feeds the classification network: [b, 64] => [b, 1]
        x = self.outlayer(out1, training=training)
        # p(y is pos | x)
        prob = tf.sigmoid(x)
        return prob
Embedding layer: encodes each word into a word vector. It accepts numerically encoded word indices, and its weights are trainable.
RNN layers: each layer needs the previous layer's state output at every timestamp, so every RNN layer except the last must return its output at each timestamp, which is done by setting return_sequences=True; that output serves as the next layer's input. The dropout argument is a regularizer: it randomly drops part of the connections during training.
The classification network performs a binary classification task, so its output is a single node. The input sequence is encoded into word vectors by the Embedding layer and passed through the two RNN layers to extract semantic features; the state output of the last timestamp of the last layer is fed to the classification network, and a Sigmoid activation then yields the output probability.
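As a quick sanity check of the return_sequences setting (a standalone sketch with a made-up batch size of 2, not part of the original program), the shapes work out as follows:

import tensorflow as tf
from tensorflow.keras import layers

x = tf.random.normal([2, 80, 100])  # a fake embedded batch: [b, 80, 100]
# With return_sequences=True the layer keeps its output at every timestamp,
# so the next RNN layer can consume it; without it, only the last state remains.
print(layers.SimpleRNN(64, return_sequences=True)(x).shape)  # (2, 80, 64)
print(layers.SimpleRNN(64)(x).shape)                         # (2, 64)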
Using the Cell-based approach:
class MyRNN(keras.Model):
    def __init__(self, units):
        super(MyRNN, self).__init__()
        # [b, 64]: initial state vectors for the two Cells, reused on every call
        self.state0 = [tf.zeros([batches, units])]
        self.state1 = [tf.zeros([batches, units])]
        # Word embedding: [b, 80] ==> [b, 80, 100]
        self.embedding = layers.Embedding(total_words, embedding_len, input_length=max_review_len)
        # Build 2 Cells
        self.rnn_cell0 = layers.SimpleRNNCell(units, dropout=0.5)  # dropout for regularization
        self.rnn_cell1 = layers.SimpleRNNCell(units, dropout=0.5)
        # Classification network for the Cell output features, binary classification
        # [b, 64] => [b, 1]
        self.outlayer = Sequential([
            layers.Dense(units),
            layers.Dropout(rate=0.5),
            layers.ReLU(),
            layers.Dense(1)
        ])

    # Forward pass
    def call(self, inputs, training=None):
        x = inputs  # [b, 80]
        # embedding: [b, 80] ==> [b, 80, 100]
        x = self.embedding(x)
        # rnn cell compute: [b, 80, 100] => [b, 64]
        state0 = self.state0
        state1 = self.state1
        for word in tf.unstack(x, axis=1):  # word: [b, 100], unrolled along the time axis
            out0, state0 = self.rnn_cell0(word, state0, training=training)
            out1, state1 = self.rnn_cell1(out0, state1, training=training)
        # Last output of the final layer feeds the classification network: [b, 64] => [b, 1]
        x = self.outlayer(out1, training=training)
        # p(y is pos | x)
        prob = tf.sigmoid(x)
        return prob

With the Cell-based approach you implement the forward pass yourself and maintain each RNN layer's initial state vector; everything else is the same as the layer-based version.
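The tf.unstack call drives the manual unrolling. A tiny standalone check (shapes chosen to match the comments above) shows what it produces:

import tensorflow as tf

x = tf.zeros([128, 80, 100])        # [b, s, d], as in the model above
steps = tf.unstack(x, axis=1)       # one tensor per timestep
print(len(steps), steps[0].shape)   # 80 (128, 100)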
3. Model Assembly
model = MyRNN(units)
# Assemble the optimizer, learning rate, loss, and metrics
model.compile(optimizer=optimizers.Adam(1e-3),
              loss=losses.BinaryCrossentropy(),
              metrics=['accuracy'])

After building the network model, you specify the optimizer object, the loss function type, the evaluation metrics, and so on; this step is called assembly. For simplicity, the network is trained with Keras's Compile&Fit workflow, with the optimizer set to Adam.
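One equivalent variant (my own suggestion, not what the original program does): drop the tf.sigmoid in call(), return the raw logits, and let the loss apply the sigmoid internally, which is slightly more numerically stable:

# Assumes call() now returns x (the logits) instead of tf.sigmoid(x)
model.compile(optimizer=optimizers.Adam(1e-3),
              loss=losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])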
4. Training and Validation
# Train and validate; validation_data: the validation set
model.fit(db_train, epochs=epochs, validation_data=db_test)
# Test
model.evaluate(db_test)

Keras's Compile&Fit workflow is used to train the network: compile() sets the optimizer, learning rate, loss function, and evaluation metric (accuracy here), and fit() feeds in the training and validation datasets.
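An optional addition (not in the original post): an early-stopping callback, so training halts once validation accuracy stops improving instead of always running the full epoch budget.

# Hypothetical tweak: stop when val_accuracy plateaus and keep the best weights
early_stop = keras.callbacks.EarlyStopping(monitor='val_accuracy',
                                           patience=3,
                                           restore_best_weights=True)
model.fit(db_train, epochs=epochs, validation_data=db_test,
          callbacks=[early_stop])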
Result: the model was trained for nearly 30 epochs (training log figure from the original post not reproduced here).

5. LSTM Model
Only the network model needs to change: swap the recurrent layer type.
# Build the rnn with LSTM layers
self.rnn = Sequential([
    layers.LSTM(units, dropout=0.5, return_sequences=True),
    layers.LSTM(units, dropout=0.5)
])

Result: LSTM is clearly a bit better than SimpleRNN.

6. GRU Model
Again, only the recurrent layer type needs to change.
# Build the rnn with GRU layers
self.rnn = Sequential([
    layers.GRU(units, dropout=0.5, return_sequences=True),
    layers.GRU(units, dropout=0.5)
])

Result: also slightly better than SimpleRNN.

7. Complete Program
# -*- coding: utf-8 -*-
# @Time : 10:20
# @Author:Paranipd
# @File : imdb_rnn_cell.py
# @Software:PyCharm

import os
import tensorflow as tf
import numpy as np
from tensorflow import keras
from tensorflow.keras import Sequential, Model, layers, metrics, optimizers, losses

tf.random.set_seed(22)
np.random.seed(22)
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
assert tf.__version__.startswith('2')

batches = 128          # batch size
total_words = 10000    # vocabulary size N_vocab
max_review_len = 80    # maximum sentence length s; longer parts are truncated, shorter ones padded
embedding_len = 100    # word vector feature length

# Load the IMDB dataset; the data is numerically encoded, one integer per word
(x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=total_words)
print(x_train.shape, len(x_train[0]), y_train.shape)  # (25000,) 218 (25000,)
print(x_test.shape, len(x_test[0]), y_test.shape)     # (25000,) 68 (25000,)

# Truncate and pad the sentences to a uniform length max_review_len:
# long sentences keep their tail, short sentences are padded in front
x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_len)
x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_len)

# Build the datasets: shuffle, batch, and drop the last batch that cannot fill a whole batch
db_train = tf.data.Dataset.from_tensor_slices((x_train, y_train))
db_train = db_train.shuffle(1000).batch(batches, drop_remainder=True)
db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))
db_test = db_test.batch(batches, drop_remainder=True)
print('db_train:', db_train)  # db_train: <BatchDataset shapes: ((128, 80), (128,)), types: (tf.int32, tf.int64)>


class MyRNN(keras.Model):
    def __init__(self, units):
        super(MyRNN, self).__init__()
        # Word embedding: [b, 80] ==> [b, 80, 100]
        # embedding_len: word vector length; total_words: vocabulary size;
        # max_review_len: input sentence length
        self.embedding = layers.Embedding(total_words, embedding_len, input_length=max_review_len)
        # Two stacked RNN layers: [b, 80, 100] => [b, 64]
        self.rnn = Sequential([
            layers.SimpleRNN(units, dropout=0.5, return_sequences=True),
            layers.SimpleRNN(units, dropout=0.5)
        ])
        # Classification network for the RNN output features, binary classification
        # [b, 64] => [b, 1]
        self.outlayer = Sequential([
            layers.Dense(units),
            layers.Dropout(rate=0.5),
            layers.ReLU(),
            layers.Dense(1)
        ])

    # Forward pass
    def call(self, inputs, training=None):
        x = inputs  # [b, 80]
        # embedding: [b, 80] ==> [b, 80, 100]
        x = self.embedding(x)
        # rnn compute: [b, 80, 100] => [b, 64]
        out1 = self.rnn(x, training=training)
        # Last output of the final layer feeds the classification network: [b, 64] => [b, 1]
        x = self.outlayer(out1, training=training)
        # p(y is pos | x)
        prob = tf.sigmoid(x)
        return prob


def main():
    units = 64   # rnn state vector length
    epochs = 50
    model = MyRNN(units)
    # Assemble the optimizer, learning rate, loss, and metrics
    model.compile(optimizer=optimizers.Adam(1e-3),
                  loss=losses.BinaryCrossentropy(),
                  metrics=['accuracy'])
    # Train and validate; validation_data: the validation set
    model.fit(db_train, epochs=epochs, validation_data=db_test)
    # Test
    model.evaluate(db_test)


if __name__ == '__main__':
    main()