Normalize length
This function converts a list (of length num_samples) of sequences (lists of integers) into a 2D NumPy array of shape (num_samples, num_timesteps). num_timesteps is the maxlen argument if provided, otherwise the length of the longest sequence in the list.
Sequences shorter than num_timesteps are padded with value until they are num_timesteps long.
Sequences longer than num_timesteps are truncated to fit the desired length.
Where padding or truncation happens is determined by the padding and truncating arguments, respectively. By default, values are padded at, or removed from, the beginning of the sequence.
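The rules above can be sketched in plain Python. This is only an illustration of the described behavior, not the library's implementation: pad_sequences_sketch is a hypothetical helper, and it returns nested lists rather than a NumPy array.

```python
def pad_sequences_sketch(sequences, maxlen=None, padding="pre",
                         truncating="pre", value=0):
    """Hypothetical pure-Python illustration of the padding/truncation rules."""
    if maxlen is None:
        # Without maxlen, every row is padded to the longest sequence.
        maxlen = max((len(seq) for seq in sequences), default=0)

    rows = []
    for seq in sequences:
        seq = list(seq)
        if len(seq) > maxlen:
            # "pre" drops values from the front, "post" drops them from the end.
            if truncating == "pre":
                seq = seq[len(seq) - maxlen:]
            else:
                seq = seq[:maxlen]
        fill = [value] * (maxlen - len(seq))
        # "pre" places the fill values in front, "post" appends them.
        rows.append(fill + seq if padding == "pre" else seq + fill)
    return rows


batch = [[1], [2, 3], [4, 5, 6]]
print(pad_sequences_sketch(batch))                  # [[0, 0, 1], [0, 2, 3], [4, 5, 6]]
print(pad_sequences_sketch(batch, padding="post"))  # [[1, 0, 0], [2, 3, 0], [4, 5, 6]]
print(pad_sequences_sketch(batch, maxlen=2))        # [[0, 1], [2, 3], [5, 6]]
```

Every row ends up with the same length, which is what allows the real function to return a rectangular array.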
tf.keras.utils.pad_sequences(
    sequences,         # list of sequences (each sequence is a list of integers)
    maxlen=None,       # optional int, maximum length of all sequences; if not provided, sequences are padded to the length of the longest individual sequence
    dtype='int32',     # optional, defaults to "int32"; type of the output sequences. To pad sequences with variable-length strings, use object
    padding='pre',     # "pre" or "post" (optional, defaults to "pre"): pad before or after each sequence
    truncating='pre',  # "pre" or "post" (optional, defaults to "pre"): remove values from sequences longer than maxlen, at the beginning or at the end
    value=0.0          # float or string, padding value (optional, defaults to 0.)
)
Return value
Numpy array with shape (len(sequences), maxlen)
import tensorflow as tf  # import tensorflow
sequence = [[1], [2, 3], [4, 5, 6]]  # input sequences
tf.keras.preprocessing.sequence.pad_sequences(sequence)  # length normalization
array([[0, 0, 1],
       [0, 2, 3],
       [4, 5, 6]])
import tensorflow as tf  # import tensorflow
sequence = [[1], [2, 3], [4, 5, 6]]  # input sequences
tf.keras.preprocessing.sequence.pad_sequences(sequence, padding='post')  # length normalization
array([[1, 0, 0],
       [2, 3, 0],
       [4, 5, 6]])
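The two examples above only vary padding. As a further sketch, the snippet below tries the maxlen, truncating and value arguments on the same input. It assumes a TensorFlow version that exposes tf.keras.utils.pad_sequences (the referenced v2.9.1 does); the variable names are chosen here for illustration.

```python
import tensorflow as tf

sequences = [[1], [2, 3], [4, 5, 6]]

# maxlen=2 with the default truncating='pre' drops values from the front.
pre_truncated = tf.keras.utils.pad_sequences(sequences, maxlen=2)

# truncating='post' drops values from the end instead.
post_truncated = tf.keras.utils.pad_sequences(
    sequences, maxlen=2, truncating='post')

# value sets the fill value; here -1 is appended after each short sequence.
custom_fill = tf.keras.utils.pad_sequences(
    sequences, padding='post', value=-1)

print(pre_truncated.tolist())   # [[0, 1], [2, 3], [5, 6]]
print(post_truncated.tolist())  # [[0, 1], [2, 3], [4, 5]]
print(custom_fill.tolist())     # [[1, -1, -1], [2, 3, -1], [4, 5, 6]]
```

Note that padding and truncating are independent: in the first two calls the short sequence [1] is still padded at the front, while only the long sequence [4, 5, 6] is affected by truncating.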
Main reference: tf.keras.utils.pad_sequences | TensorFlow Core v2.9.1 (google.cn)
