当前位置:网站首页>Torch Cookbook
Torch Cookbook
2022-07-06 21:16:00 【Love CV】
This code is based on PyTorch 1.0 edition , The following packages are needed
import collections
import os
import shutil
import tqdm
import numpy as np
import PIL.Image
import torch
import torchvision
1. Basic configuration
Check PyTorch edition
torch.__version__ # PyTorch version
torch.version.cuda # Corresponding CUDA version
torch.backends.cudnn.version() # Corresponding cuDNN version
torch.cuda.get_device_name(0) # GPU type
to update PyTorch
PyTorch Will be installed in anaconda3/lib/python3.7/site-packages/torch/ Under the table of contents .
conda update pytorch torchvision -c pytorch
Fix random seeds
torch.manual_seed(0)
torch.cuda.manual_seed_all(0)
Specifies that the program runs on a specific GPU obstruct
Specify the environment variable... On the command line
CUDA_VISIBLE_DEVICES=0,1 python train.py
Or specify... In the code
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'
To determine if there is CUDA Support
torch.cuda.is_available()
Set to cuDNN benchmark Pattern
Benchmark Mode will speed up the calculation , But because of the randomness in the calculation , The feedforward results of each network are slightly different .
torch.backends.cudnn.benchmark = True
If you want to avoid this kind of result fluctuation , Set up
torch.backends.cudnn.deterministic = True
eliminate GPU Storage
Sometimes Control-C After suspension of operation GPU Storage not released in time , It needs to be cleared manually . stay PyTorch The interior can
torch.cuda.empty_cache()
Or at the command line, you can use ps Find the program PID, Reuse kill End the process
ps aux | grep python
kill -9 [pid]
Or directly reset those that are not cleared GPU
nvidia-smi --gpu-reset -i [gpu_id]
2. Tensor processing
Tensor basic information
tensor.type() # Data type
tensor.size() # Shape of the tensor. It is a subclass of Python tuple
tensor.dim() # Number of dimensions.
Data type conversion
# Set default tensor type. Float in PyTorch is much faster than double.
torch.set_default_tensor_type(torch.FloatTensor)
# Type convertions.
tensor = tensor.cuda()
tensor = tensor.cpu()
tensor = tensor.float()
tensor = tensor.long()
torch.Tensor And np.ndarray transformation
# torch.Tensor -> np.ndarray.
ndarray = tensor.cpu().numpy()
# np.ndarray -> torch.Tensor.
tensor = torch.from_numpy(ndarray).float()
tensor = torch.from_numpy(ndarray.copy()).float() # If ndarray has negative stride
torch.Tensor And PIL.Image transformation
PyTorch The tensor in the defaults to N×D×H×W The order of , And the data range is [0, 1], It needs to be transposed and normalized .
# torch.Tensor -> PIL.Image.
image = PIL.Image.fromarray(torch.clamp(tensor * 255, min=0, max=255
).byte().permute(1, 2, 0).cpu().numpy())
image = torchvision.transforms.functional.to_pil_image(tensor) # Equivalently way
# PIL.Image -> torch.Tensor.
tensor = torch.from_numpy(np.asarray(PIL.Image.open(path))
).permute(2, 0, 1).float() / 255
tensor = torchvision.transforms.functional.to_tensor(PIL.Image.open(path)) # Equivalently way
np.ndarray And PIL.Image transformation
# np.ndarray -> PIL.Image.
image = PIL.Image.fromarray(ndarray.astypde(np.uint8))
# PIL.Image -> np.ndarray.
ndarray = np.asarray(PIL.Image.open(path))
Extract values from a tensor that contains only one element
This counts during training loss It's especially useful in the process of change . Otherwise, it will accumulate the calculation chart , send GPU More and more storage .
value = tensor.item()
Tensor deformation
Tensor deformation is often used to input the convolution features into the fully connected layer . comparison torch.view,torch.reshape It can automatically handle the discontinuous input tensor .
tensor = torch.reshape(tensor, shape)
Out of order
tensor = tensor[torch.randperm(tensor.size(0))] # Shuffle the first dimension
Flip horizontal
PyTorch I won't support it tensor[::-1] This negative step operation , Horizontal flipping can be realized by tensor index .
# Assume tensor has shape N*D*H*W.
tensor = tensor[:, :, :, torch.arange(tensor.size(3) - 1, -1, -1).long()]
Copy tensor
There are three ways of copying , Corresponding to different needs .
# Operation | New/Shared memory | Still in computation graph |
tensor.clone() # | New | Yes |
tensor.detach() # | Shared | No |
tensor.detach.clone()() # | New | No |
Splicing tensor
Be careful torch.cat and torch.stack The difference is that torch.cat Stitching along a given dimension , and torch.stack One dimension will be added . For example, if the parameter is 3 individual 10×5 Tensor ,torch.cat The result is 30×5 Tensor , and torch.stack The result is 3×10×5 Tensor .
tensor = torch.cat(list_of_tensors, dim=0)
tensor = torch.stack(list_of_tensors, dim=0)
Convert integer tags to unique heat (one-hot) code
PyTorch The tags in the default from 0 Start .
N = tensor.size(0)
one_hot = torch.zeros(N, num_classes).long()
one_hot.scatter_(dim=1, index=torch.unsqueeze(tensor, dim=1), src=torch.ones(N, num_classes).long())
Get non-zero / Zero element
torch.nonzero(tensor) # Index of non-zero elements
torch.nonzero(tensor == 0) # Index of zero elements
torch.nonzero(tensor).size(0) # Number of non-zero elements
torch.nonzero(tensor == 0).size(0) # Number of zero elements
Judge that two tensors are equal
torch.allclose(tensor1, tensor2) # float tensor
torch.equal(tensor1, tensor2) # int tensor
Tensor expansion
# Expand tensor of shape 64*512 to shape 64*512*7*7.
torch.reshape(tensor, (64, 512, 1, 1)).expand(64, 512, 7, 7)
Matrix multiplication
# Matrix multiplication: (m*n) * (n*p) -> (m*p).
result = torch.mm(tensor1, tensor2)
# Batch matrix multiplication: (b*m*n) * (b*n*p) -> (b*m*p).
result = torch.bmm(tensor1, tensor2)
# Element-wise multiplication.
result = tensor1 * tensor2
Calculate the distance between two sets of data
# X1 is of shape m*d, X2 is of shape n*d.
dist = torch.sqrt(torch.sum((X1[:,None,:] - X2) ** 2, dim=2))
3. Model definition
Convolution layer
The most common convolution layer configuration is
conv = torch.nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=1, padding=1, bias=True)
conv = torch.nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=1, padding=0, bias=True)
If the roll up layer configuration is complex , It is not convenient to calculate the output size , The following visualization tools can be used to assist
Convolution Visualizer
https://ezyang.github.io/convolution-visualizer/index.html
GAP(Global average pooling) layer
gap = torch.nn.AdaptiveAvgPool2d(output_size=1)
Bilinear convergence (bilinear pooling)[1]
X = torch.reshape(N, D, H * W) # Assume X has shape N*D*H*W
X = torch.bmm(X, torch.transpose(X, 1, 2)) / (H * W) # Bilinear pooling
assert X.size() == (N, D, D)
X = torch.reshape(X, (N, D * D))
X = torch.sign(X) * torch.sqrt(torch.abs(X) + 1e-5) # Signed-sqrt normalization
X = torch.nn.functional.normalize(X) # L2 normalization
Multi card synchronization BN(Batch normalization)
When using torch.nn.DataParallel Run the code on multiple GPU On the card ,PyTorch Of BN Layer default operation is to calculate the mean value and standard deviation of data on each card independently , Sync BN Use all the data on the card to calculate BN The mean and standard deviation of layers , Ease when batch size (batch size) Compare the situation when the mean value and standard deviation are not estimated correctly , It is an effective skill to improve performance in the task of target detection .
Synchronized-BatchNorm-PyTorch
https://github.com/vacancy/Synchronized-BatchNorm-PyTorch
Now? PyTorch Officials have supported synchronization BN operation
sync_bn = torch.nn.SyncBatchNorm(num_features, eps=1e-05, momentum=0.1, affine=True,
track_running_stats=True)
All of the existing network BN Change layer to synchronization BN layer
def convertBNtoSyncBN(module, process_group=None):
'''Recursively replace all BN layers to SyncBN layer.
Args:
module[torch.nn.Module]. Network
'''
if isinstance(module, torch.nn.modules.batchnorm._BatchNorm):
sync_bn = torch.nn.SyncBatchNorm(module.num_features, module.eps, module.momentum,
module.affine, module.track_running_stats, process_group)
sync_bn.running_mean = module.running_mean
sync_bn.running_var = module.running_var
if module.affine:
sync_bn.weight = module.weight.clone().detach()
sync_bn.bias = module.bias.clone().detach()
return sync_bn
else:
for name, child_module in module.named_children():
setattr(module, name) = convert_syncbn_model(child_module, process_group=process_group))
return module
similar BN moving average
If you want to achieve something similar BN Sliding average operation , stay forward Function to use in place (inplace) The operation assigns a value to the sliding average .
class BN(torch.nn.Module)
def __init__(self):
...
self.register_buffer('running_mean', torch.zeros(num_features))
def forward(self, X):
...
self.running_mean += momentum * (current - self.running_mean)
Calculate the overall parameters of the model
num_parameters = sum(torch.numel(parameter) for parameter in model.parameters())
similar Keras Of model.summary() Output model information
pytorch-summary
https://github.com/sksq96/pytorch-summary
Model weight initialization
Be careful model.modules() and model.children() The difference between :model.modules() Will iterate through all the sub layers of the model , and model.children() Only one layer under the model will be traversed .
# Common practise for initialization.
for layer in model.modules():
if isinstance(layer, torch.nn.Conv2d):
torch.nn.init.kaiming_normal_(layer.weight, mode='fan_out',
nonlinearity='relu')
if layer.bias is not None:
torch.nn.init.constant_(layer.bias, val=0.0)
elif isinstance(layer, torch.nn.BatchNorm2d):
torch.nn.init.constant_(layer.weight, val=1.0)
torch.nn.init.constant_(layer.bias, val=0.0)
elif isinstance(layer, torch.nn.Linear):
torch.nn.init.xavier_normal_(layer.weight)
if layer.bias is not None:
torch.nn.init.constant_(layer.bias, val=0.0)
# Initialization with given tensor.
layer.weight = torch.nn.Parameter(tensor)
Part of the layer uses the pre training model
Note that if the saved model is torch.nn.DataParallel, The current model also needs to be torch.nn.DataParallel.torch.nn.DataParallel(model).module == model.
model.load_state_dict(torch.load('model,pth'), strict=False)
Will be in GPU The saved model is loaded into CPU
model.load_state_dict(torch.load('model,pth', map_location='cpu'))
4. Data preparation 、 Feature extraction and fine tuning
Image fragmentation (image shuffle)/ Regional confusion mechanism (region confusion mechanism,RCM)[2]
# X is torch.Tensor of size N*D*H*W.
# Shuffle rows
Q = (torch.unsqueeze(torch.arange(num_blocks), dim=1) * torch.ones(1, num_blocks).long()
+ torch.randint(low=-neighbour, high=neighbour, size=(num_blocks, num_blocks)))
Q = torch.argsort(Q, dim=0)
assert Q.size() == (num_blocks, num_blocks)
X = [torch.chunk(row, chunks=num_blocks, dim=2)
for row in torch.chunk(X, chunks=num_blocks, dim=1)]
X = [[X[Q[i, j].item()][j] for j in range(num_blocks)]
for i in range(num_blocks)]
# Shulle columns.
Q = (torch.ones(num_blocks, 1).long() * torch.unsqueeze(torch.arange(num_blocks), dim=0)
+ torch.randint(low=-neighbour, high=neighbour, size=(num_blocks, num_blocks)))
Q = torch.argsort(Q, dim=1)
assert Q.size() == (num_blocks, num_blocks)
X = [[X[i][Q[i, j].item()] for j in range(num_blocks)]
for i in range(num_blocks)]
Y = torch.cat([torch.cat(row, dim=2) for row in X], dim=1)
Get the basic information of video data
import cv2
video = cv2.VideoCapture(mp4_path)
height = int(video.get(cv2.CAP_PROP_FRAME_HEIGHT))
width = int(video.get(cv2.CAP_PROP_FRAME_WIDTH))
num_frames = int(video.get(cv2.CAP_PROP_FRAME_COUNT))
fps = int(video.get(cv2.CAP_PROP_FPS))
video.release()
TSN Each paragraph (segment) Sample a video [3]
K = self._num_segments
if is_train:
if num_frames > K:
# Random index for each segment.
frame_indices = torch.randint(
high=num_frames // K, size=(K,), dtype=torch.long)
frame_indices += num_frames // K * torch.arange(K)
else:
frame_indices = torch.randint(
high=num_frames, size=(K - num_frames,), dtype=torch.long)
frame_indices = torch.sort(torch.cat((
torch.arange(num_frames), frame_indices)))[0]
else:
if num_frames > K:
# Middle index for each segment.
frame_indices = num_frames / K // 2
frame_indices += num_frames // K * torch.arange(K)
else:
frame_indices = torch.sort(torch.cat((
torch.arange(num_frames), torch.arange(K - num_frames))))[0]
assert frame_indices.size() == (K,)
return [frame_indices[i] for i in range(K)]
extract ImageNet The convolution characteristics of a certain layer in the pre training model
# VGG-16 relu5-3 feature.
model = torchvision.models.vgg16(pretrained=True).features[:-1]
# VGG-16 pool5 feature.
model = torchvision.models.vgg16(pretrained=True).features
# VGG-16 fc7 feature.
model = torchvision.models.vgg16(pretrained=True)
model.classifier = torch.nn.Sequential(*list(model.classifier.children())[:-3])
# ResNet GAP feature.
model = torchvision.models.resnet18(pretrained=True)
model = torch.nn.Sequential(collections.OrderedDict(
list(model.named_children())[:-1]))
with torch.no_grad():
model.eval()
conv_representation = model(image)
extract ImageNet The convolution characteristics of the pre training model
class FeatureExtractor(torch.nn.Module):
"""Helper class to extract several convolution features from the given
pre-trained model.
Attributes:
_model, torch.nn.Module.
_layers_to_extract, list<str> or set<str>
Example:
>>> model = torchvision.models.resnet152(pretrained=True)
>>> model = torch.nn.Sequential(collections.OrderedDict(
list(model.named_children())[:-1]))
>>> conv_representation = FeatureExtractor(
pretrained_model=model,
layers_to_extract={'layer1', 'layer2', 'layer3', 'layer4'})(image)
"""
def __init__(self, pretrained_model, layers_to_extract):
torch.nn.Module.__init__(self)
self._model = pretrained_model
self._model.eval()
self._layers_to_extract = set(layers_to_extract)
def forward(self, x):
with torch.no_grad():
conv_representation = []
for name, layer in self._model.named_children():
x = layer(x)
if name in self._layers_to_extract:
conv_representation.append(x)
return conv_representation
Other pre training models
pretrained-models.pytorch
https://github.com/Cadene/pretrained-models.pytorch
Fine tune the full connection layer
model = torchvision.models.resnet18(pretrained=True)
for param in model.parameters():
param.requires_grad = False
model.fc = nn.Linear(512, 100) # Replace the last fc layer
optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-2, momentum=0.9, weight_decay=1e-4)
Fine tune the full connection layer with a large learning rate , Small learning rate, fine-tuning volume layer
model = torchvision.models.resnet18(pretrained=True)
finetuned_parameters = list(map(id, model.fc.parameters()))
conv_parameters = (p for p in model.parameters() if id(p) not in finetuned_parameters)
parameters = [{'params': conv_parameters, 'lr': 1e-3},
{'params': model.fc.parameters()}]
optimizer = torch.optim.SGD(parameters, lr=1e-2, momentum=0.9, weight_decay=1e-4)
5. model training
Common training and validation data preprocessing
among ToTensor The operation will PIL.Image Or in the shape of H×W×D, The range of values is [0, 255] Of np.ndarray Convert to shape D×H×W, The range of values is [0.0, 1.0] Of torch.Tensor.
train_transform = torchvision.transforms.Compose([
torchvision.transforms.RandomResizedCrop(size=224,
scale=(0.08, 1.0)),
torchvision.transforms.RandomHorizontalFlip(),
torchvision.transforms.ToTensor(),
torchvision.transforms.Normalize(mean=(0.485, 0.456, 0.406),
std=(0.229, 0.224, 0.225)),
])
val_transform = torchvision.transforms.Compose([
torchvision.transforms.Resize(256),
torchvision.transforms.CenterCrop(224),
torchvision.transforms.ToTensor(),
torchvision.transforms.Normalize(mean=(0.485, 0.456, 0.406),
std=(0.229, 0.224, 0.225)),
])
Train the basic code framework
for t in epoch(80):
for images, labels in tqdm.tqdm(train_loader, desc='Epoch %3d' % (t + 1)):
images, labels = images.cuda(), labels.cuda()
scores = model(images)
loss = loss_function(scores, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
The marks are smooth (label smoothing)[4]
for images, labels in train_loader:
images, labels = images.cuda(), labels.cuda()
N = labels.size(0)
# C is the number of classes.
smoothed_labels = torch.full(size=(N, C), fill_value=0.1 / (C - 1)).cuda()
smoothed_labels.scatter_(dim=1, index=torch.unsqueeze(labels, dim=1), value=0.9)
score = model(images)
log_prob = torch.nn.functional.log_softmax(score, dim=1)
loss = -torch.sum(log_prob * smoothed_labels) / N
optimizer.zero_grad()
loss.backward()
optimizer.step()
Mixup[5]
beta_distribution = torch.distributions.beta.Beta(alpha, alpha)
for images, labels in train_loader:
images, labels = images.cuda(), labels.cuda()
# Mixup images.
lambda_ = beta_distribution.sample([]).item()
index = torch.randperm(images.size(0)).cuda()
mixed_images = lambda_ * images + (1 - lambda_) * images[index, :]
# Mixup loss.
scores = model(mixed_images)
loss = (lambda_ * loss_function(scores, labels)
+ (1 - lambda_) * loss_function(scores, labels[index]))
optimizer.zero_grad()
loss.backward()
optimizer.step()
L1 Regularization
l1_regularization = torch.nn.L1Loss(reduction='sum')
loss = ... # Standard cross-entropy loss
for param in model.parameters():
loss += lambda_ * torch.sum(torch.abs(param))
loss.backward()
Don't do the bias L2 Regularization / Weight attenuation (weight decay)
bias_list = (param for name, param in model.named_parameters() if name[-4:] == 'bias')
others_list = (param for name, param in model.named_parameters() if name[-4:] != 'bias')
parameters = [{'parameters': bias_list, 'weight_decay': 0},
{'parameters': others_list}]
optimizer = torch.optim.SGD(parameters, lr=1e-2, momentum=0.9, weight_decay=1e-4)
Gradient cut (gradient clipping)
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=20)
Calculation Softmax The accuracy of the output
score = model(images)
prediction = torch.argmax(score, dim=1)
num_correct = torch.sum(prediction == labels).item()
accuruacy = num_correct / labels.size(0)
Visual model feed forward calculation chart
pytorchviz
https://github.com/szagoruyko/pytorchviz
Visual learning curve
Yes Facebook self-developed Visdom and Tensorboard( Still in the experimental stage ) Two choices .
facebookresearch/visdom
https://github.com/facebookresearch/visdom
Tensorboard
https://pytorch.org/docs/stable/tensorboard.html
# Example using Visdom.
vis = visdom.Visdom(env='Learning curve', use_incoming_socket=False)
assert self._visdom.check_connection()
self._visdom.close()
options = collections.namedtuple('Options', ['loss', 'acc', 'lr'])(
loss={'xlabel': 'Epoch', 'ylabel': 'Loss', 'showlegend': True},
acc={'xlabel': 'Epoch', 'ylabel': 'Accuracy', 'showlegend': True},
lr={'xlabel': 'Epoch', 'ylabel': 'Learning rate', 'showlegend': True})
for t in epoch(80):
tran(...)
val(...)
vis.line(X=torch.Tensor([t + 1]), Y=torch.Tensor([train_loss]),
name='train', win='Loss', update='append', opts=options.loss)
vis.line(X=torch.Tensor([t + 1]), Y=torch.Tensor([val_loss]),
name='val', win='Loss', update='append', opts=options.loss)
vis.line(X=torch.Tensor([t + 1]), Y=torch.Tensor([train_acc]),
name='train', win='Accuracy', update='append', opts=options.acc)
vis.line(X=torch.Tensor([t + 1]), Y=torch.Tensor([val_acc]),
name='val', win='Accuracy', update='append', opts=options.acc)
vis.line(X=torch.Tensor([t + 1]), Y=torch.Tensor([lr]),
win='Learning rate', update='append', opts=options.lr)
Get the current learning rate
# If there is one global learning rate (which is the common case).
lr = next(iter(optimizer.param_groups))['lr']
# If there are multiple learning rates for different layers.
all_lr = []
for param_group in optimizer.param_groups:
all_lr.append(param_group['lr'])
Learning rate decline
# Reduce learning rate when validation accuarcy plateau.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='max', patience=5, verbose=True)
for t in range(0, 80):
train(...); val(...)
scheduler.step(val_acc)
# Cosine annealing learning rate.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=80)
# Reduce learning rate by 10 at given epochs.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[50, 70], gamma=0.1)
for t in range(0, 80):
scheduler.step()
train(...); val(...)
# Learning rate warmup by 10 epochs.
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda t: t / 10)
for t in range(0, 10):
scheduler.step()
train(...); val(...)
Save and load breakpoints
Note that in order to be able to resume training , We need to keep the state of both the model and optimizer , And the current number of training rounds .
# Save checkpoint.
is_best = current_acc > best_acc
best_acc = max(best_acc, current_acc)
checkpoint = {
'best_acc': best_acc,
'epoch': t + 1,
'model': model.state_dict(),
'optimizer': optimizer.state_dict(),
}
model_path = os.path.join('model', 'checkpoint.pth.tar')
torch.save(checkpoint, model_path)
if is_best:
shutil.copy('checkpoint.pth.tar', model_path)
# Load checkpoint.
if resume:
model_path = os.path.join('model', 'checkpoint.pth.tar')
assert os.path.isfile(model_path)
checkpoint = torch.load(model_path)
best_acc = checkpoint['best_acc']
start_epoch = checkpoint['epoch']
model.load_state_dict(checkpoint['model'])
optimizer.load_state_dict(checkpoint['optimizer'])
print('Load checkpoint at epoch %d.' % start_epoch)
Computational accuracy 、 Precision rate (precision)、 Recall rate (recall)
# data['label'] and data['prediction'] are groundtruth label and prediction
# for each image, respectively.
accuracy = np.mean(data['label'] == data['prediction']) * 100
# Compute recision and recall for each class.
for c in range(len(num_classes)):
tp = np.dot((data['label'] == c).astype(int),
(data['prediction'] == c).astype(int))
tp_fp = np.sum(data['prediction'] == c)
tp_fn = np.sum(data['label'] == c)
precision = tp / tp_fp * 100
recall = tp / tp_fn * 100
6. Model test
Calculate the precision of each category (precision)、 Recall rate (recall)、F1 And overall indicators
import sklearn.metrics
all_label = []
all_prediction = []
for images, labels in tqdm.tqdm(data_loader):
# Data.
images, labels = images.cuda(), labels.cuda()
# Forward pass.
score = model(images)
# Save label and predictions.
prediction = torch.argmax(score, dim=1)
all_label.append(labels.cpu().numpy())
all_prediction.append(prediction.cpu().numpy())
# Compute RP and confusion matrix.
all_label = np.concatenate(all_label)
assert len(all_label.shape) == 1
all_prediction = np.concatenate(all_prediction)
assert all_label.shape == all_prediction.shape
micro_p, micro_r, micro_f1, _ = sklearn.metrics.precision_recall_fscore_support(
all_label, all_prediction, average='micro', labels=range(num_classes))
class_p, class_r, class_f1, class_occurence = sklearn.metrics.precision_recall_fscore_support(
all_label, all_prediction, average=None, labels=range(num_classes))
# Ci,j = #{y=i and hat_y=j}
confusion_mat = sklearn.metrics.confusion_matrix(
all_label, all_prediction, labels=range(num_classes))
assert confusion_mat.shape == (num_classes, num_classes)
Write various results into a spreadsheet
import csv
# Write results onto disk.
with open(os.path.join(path, filename), 'wt', encoding='utf-8') as f:
f = csv.writer(f)
f.writerow(['Class', 'Label', '# occurence', 'Precision', 'Recall', 'F1',
'Confused class 1', 'Confused class 2', 'Confused class 3',
'Confused 4', 'Confused class 5'])
for c in range(num_classes):
index = np.argsort(confusion_mat[:, c])[::-1][:5]
f.writerow([
label2class[c], c, class_occurence[c], '%4.3f' % class_p[c],
'%4.3f' % class_r[c], '%4.3f' % class_f1[c],
'%s:%d' % (label2class[index[0]], confusion_mat[index[0], c]),
'%s:%d' % (label2class[index[1]], confusion_mat[index[1], c]),
'%s:%d' % (label2class[index[2]], confusion_mat[index[2], c]),
'%s:%d' % (label2class[index[3]], confusion_mat[index[3], c]),
'%s:%d' % (label2class[index[4]], confusion_mat[index[4], c])])
f.writerow(['All', '', np.sum(class_occurence), micro_p, micro_r, micro_f1,
'', '', '', '', ''])
7. PyTorch Other matters needing attention
Model definition
It is recommended that there be layers and confluences with parameters (pooling) Layer using torch.nn Module definition , The activation function directly uses torch.nn.functional.torch.nn Module and torch.nn.functional The difference is that ,torch.nn Module in the calculation of the underlying call torch.nn.functional, but torch.nn The module includes the parameters of this layer , You can also deal with training and testing two network states . Use torch.nn.functional Pay attention to the network state , Such as
def forward(self, x):
...
x = torch.nn.functional.dropout(x, p=0.5, training=self.training)
model(x) Pre use model.train() and model.eval() Switch network status .
Do not need to calculate the gradient of the code block with torch.no_grad() Include .model.eval() and torch.no_grad() The difference is that ,model.eval() It is to switch the network to the test state , for example BN And random deactivation (dropout) Use different calculation methods during training and testing .torch.no_grad() It's closing PyTorch The automatic derivation mechanism of tensor , To reduce storage usage and accelerate Computing , The result obtained cannot be carried out loss.backward().
torch.nn.CrossEntropyLoss The input of does not need to go through Softmax.torch.nn.CrossEntropyLoss Equivalent to torch.nn.functional.log_softmax + torch.nn.NLLLoss.
loss.backward() Pre use optimizer.zero_grad() Clear the cumulative gradient .optimizer.zero_grad() and model.zero_grad() The effect is the same .
PyTorch Performance and debugging
torch.utils.data.DataLoader Try to set pin_memory=True, For very small data sets such as MNIST Set up pin_memory=False It's faster .num_workers We need to find the fastest value in the experiment .
use del Delete unused intermediate variables in time , save GPU Storage .
Use inplace Operation can save GPU Storage , Such as
x = torch.nn.functional.relu(x, inplace=True)
Besides , You can also use torch.utils.checkpoint In forward propagation, only some intermediate results are retained to save GPU Storage use , The content required in back propagation is calculated from the nearest intermediate results .
Reduce CPU and GPU Data transfer between . For example, if you want to know a epoch Each of them mini-batch Of loss And accuracy , First, accumulate them in GPU Medium one epoch After the end of the transmission back together CPU It's better than every one mini-batch All once GPU To CPU The transmission is faster .
Use semi precision floating point numbers half() There will be a certain speed increase , Specific efficiency depends on GPU model . It is necessary to be careful of the stability problems caused by low numerical accuracy .
Use... Often assert tensor.size() == (N, D, H, W) As a means of debugging , Make sure that the tensor dimension is consistent with what you envision .
Except for tags y Outside , Try to use less one-dimensional tensor , Use n*1 Instead of , It can avoid some unexpected calculation results of one-dimensional tensor .
It takes time to count all parts of the code
with torch.autograd.profiler.profile(enabled=True, use_cuda=False) as profile:
...
print(profile)
Or run... On the command line
python -m torch.utils.bottleneck main.py
边栏推荐
- 【mysql】触发器
- 'class file has wrong version 52.0, should be 50.0' - class file has wrong version 52.0, should be 50.0
- 什么是RDB和AOF
- HMS Core 机器学习服务打造同传翻译新“声”态,AI让国际交流更顺畅
- Select data Column subset in table R [duplicate] - select subset of columns in data table R [duplicate]
- Yyds dry goods count re comb this of arrow function
- 【mysql】游标的基本使用
- LLVM之父Chris Lattner:为什么我们要重建AI基础设施软件
- ICML 2022 | Flowformer: 任务通用的线性复杂度Transformer
- 愛可可AI前沿推介(7.6)
猜你喜欢
Study notes of grain Mall - phase I: Project Introduction
968 edit distance
Reference frame generation based on deep learning
##无yum源安装spug监控
【滑动窗口】第九届蓝桥杯省赛B组:日志统计
OneNote 深度评测:使用资源、插件、模版
硬件开发笔记(十): 硬件开发基本流程,制作一个USB转RS232的模块(九):创建CH340G/MAX232封装库sop-16并关联原理图元器件
Laravel notes - add the function of locking accounts after 5 login failures in user-defined login (improve system security)
[sliding window] group B of the 9th Landbridge cup provincial tournament: log statistics
HMS core machine learning service creates a new "sound" state of simultaneous interpreting translation, and AI makes international exchanges smoother
随机推荐
@PathVariable
[redis design and implementation] part I: summary of redis data structure and objects
Nodejs教程之让我们用 typescript 创建你的第一个 expressjs 应用程序
15million employees are easy to manage, and the cloud native database gaussdb makes HR office more efficient
JS操作dom元素(一)——获取DOM节点的六种方式
Is this the feeling of being spoiled by bytes?
Chris LATTNER, the father of llvm: why should we rebuild AI infrastructure software
基于深度学习的参考帧生成
[in depth learning] pytorch 1.12 was released, officially supporting Apple M1 chip GPU acceleration and repairing many bugs
It's almost the new year, and my heart is lazy
None of the strongest kings in the monitoring industry!
Laravel笔记-自定义登录中新增登录5次失败锁账户功能(提高系统安全性)
Thinking about agile development
2017 8th Blue Bridge Cup group a provincial tournament
Yyds dry goods count re comb this of arrow function
过程化sql在定义变量上与c语言中的变量定义有什么区别
数据湖(八):Iceberg数据存储格式
The use method of string is startwith () - start with XX, endswith () - end with XX, trim () - delete spaces at both ends
039. (2.8) thoughts in the ward
硬件开发笔记(十): 硬件开发基本流程,制作一个USB转RS232的模块(九):创建CH340G/MAX232封装库sop-16并关联原理图元器件