Author | [email protected]
Compiled by | Flin
Source | analyticsvidhya
Introduction
fastai is a popular open-source library for learning and practicing machine learning and deep learning. Jeremy Howard and Rachel Thomas founded fast.ai with the goal of making deep learning resources more accessible. All the resources fast.ai provides, such as courses, software, and research papers, are completely free.
In August 2020, fastai v2 was released, promising a faster and more flexible implementation of the deep learning framework. The 2020 fastai course combines the core concepts of machine learning and deep learning. It also introduces users to important aspects of model production and deployment.
In this article, I discuss the techniques introduced in the first three lessons of the fast.ai beginner course for building a fast and simple image classification model. While building the model, you will also learn how to easily develop a web application for it and deploy it to production.
This article follows the top-down approach that Jeremy uses in his course. You will first learn how to train an image classifier. Later, the details of the model used for classification are explained. To follow this article, you need to know Python, because fastai is written in Python and built on PyTorch. I suggest running this code in Google Colab or Gradient, because we need GPU access and fastai installs easily on both platforms.
Install, import, and load the dataset
!pip install -Uqq fastbook
import fastbook
fastbook.setup_book()
from fastbook import *
from fastai.vision.widgets import *
Install fastai and import the necessary libraries. If you are using Colab, you must grant access to your Google Drive to save files and images. You can download any image dataset from sources such as Kaggle and Bing image search. fast.ai also has a large collection of images. In this article I used a set of chest X-ray images from https://github.com/ieee8023/c...
path = Path('/content/gdrive/My Drive/Covid19images')
Save the path to the dataset location in a Path() object. If you use a fast.ai dataset, you can use the following code:
path = untar_data(URLs.PETS)/'images'
This downloads and extracts the images from the fastai PETS dataset collection.
Check the image path and display some sample images from the dataset. I have used the Python Imaging Library (PIL) for this.
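path.ls()  # list the contents of the dataset folder
from PIL import Image
img = Image.open(path/'train/covid/1-s2.0-S1684118220300682-main.pdf-002-a2.png')
print(img.shape)  # .shape and .to_thumb are added to PIL images by the fastai import above
img.to_thumb(128,128)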
In this image classification problem, I train a model to classify X-ray images as COVID or No COVID. The preprocessed dataset has already been placed in separate COVID and No COVID folders (source: Christian Tutivén Gálvez).
If you are using the fast.ai dataset, use the following function to group the images based on the pet's file name:
def is_cat(x): return x[0].isupper()
PETS is a collection of cat and dog images. The file names of cat images start with an uppercase letter, so they are easy to classify.
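To make the naming rule concrete, here is a minimal sketch that applies is_cat to a few file names. The file names are made up for illustration and are not taken from the dataset.

```python
# Label rule from the article: cat image file names start with an
# uppercase letter, dog image file names with a lowercase one.
def is_cat(x):
    return x[0].isupper()

# Hypothetical file names, chosen only to illustrate the rule.
sample_files = ["Bengal_12.jpg", "beagle_7.jpg", "Siamese_3.jpg", "pug_41.jpg"]

labels = {name: is_cat(name) for name in sample_files}
for name, cat in labels.items():
    print(f"{name}: {'cat' if cat else 'dog'}")
```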
Image transformation
Image transformation is a key step in training an image model. It is also known as data augmentation. Image transformations are necessary to keep the model from overfitting. There are many ways to transform images, such as resizing, cropping, squishing, and padding. However, squishing distorts the original information in the image, and padding adds extra pixels. Therefore, randomly resizing and cropping the images produces good results.
As the example below shows, this method samples a random region of each image in each epoch. This lets the model learn more details about each image and so reach higher accuracy.
Another point to remember is to always transform only the training images and leave the validation images unmodified. The fastai library handles this by default.
item_tfms=Resize(128, ResizeMethod.Squish)
item_tfms=Resize(128, ResizeMethod.Pad, pad_mode='zeros')
item_tfms=RandomResizedCrop(128, min_scale=0.3)  # min_scale=0.3: each random crop covers at least 30% of the image area
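The idea behind the random crop can be sketched in plain Python. This is a simplified illustration, not fastai's implementation: the helper random_crop_box is hypothetical, it keeps the aspect ratio of the source image, and it only picks the crop box without resizing anything.

```python
import random

def random_crop_box(width, height, min_scale=0.3, rng=random):
    """Pick a random crop box covering between min_scale and 100% of the image area."""
    scale = rng.uniform(min_scale, 1.0)   # fraction of the image area to keep
    side = scale ** 0.5                   # shrink both sides by the same factor
    crop_w = max(1, round(width * side))
    crop_h = max(1, round(height * side))
    left = rng.randint(0, width - crop_w)  # random position that stays inside the image
    top = rng.randint(0, height - crop_h)
    return left, top, crop_w, crop_h

# A fresh box is drawn every time, so each epoch sees a different region.
rng = random.Random(0)
boxes = [random_crop_box(256, 192, min_scale=0.3, rng=rng) for _ in range(5)]
for box in boxes:
    print(box)
```

Each crop would then be resized to the target size (128 in the line above) before it is passed to the model.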
The fastai library provides a standard set of augmentations through the aug_transforms function. If the images are a uniform size, the augmentations can be applied in batches, which saves a lot of training time.
tfms = aug_transforms(do_flip = True, flip_vert = False, mult=2.0)
The DataLoaders class in fastai is a convenient way to store the various objects used for training and validating a model. If you want to customize the objects used during training, you can use the DataBlock class in combination with DataLoaders.
data = ImageDataLoaders.from_folder(path, train="train", valid_pct=0.2, item_tfms=Resize(128), batch_tfms=tfms, bs=30, num_workers=4)
If the image labels are defined in a metadata file, you can use DataBlock to split the images and labels into two different blocks, as shown in the code snippet below. Use the defined data block with the dataloaders function to access the images.
Data = DataBlock( blocks=(ImageBlock, CategoryBlock), get_items=get_image_files,
splitter=RandomSplitter(valid_pct=0.2, seed=42), get_y=parent_label, item_tfms=Resize(128))
dls = Data.dataloaders(path)
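The two choices in the data block above, labeling by parent folder and holding out 20% of the files with a fixed seed, can be sketched without fastai. The helpers label_from_parent and random_split are hypothetical stand-ins for illustration, and the file names are made up; this is not how fastai implements parent_label or RandomSplitter internally.

```python
import random
from pathlib import PurePosixPath

def label_from_parent(file_path):
    """Return the name of the folder that directly contains the file."""
    return PurePosixPath(file_path).parent.name

def random_split(items, valid_pct=0.2, seed=42):
    """Shuffle the indices reproducibly and hold out valid_pct of them for validation."""
    idxs = list(range(len(items)))
    random.Random(seed).shuffle(idxs)
    cut = int(valid_pct * len(items))
    return idxs[cut:], idxs[:cut]  # (training indices, validation indices)

# Hypothetical file layout: one folder per class.
files = [f"train/covid/img_{i}.png" for i in range(5)]
files += [f"train/noCovid/img_{i}.png" for i in range(5)]

train_idx, valid_idx = random_split(files)
print(len(train_idx), "training files,", len(valid_idx), "validation files")
print([label_from_parent(files[i]) for i in valid_idx])
```

Fixing the seed means the same files land in the validation set on every run, which keeps results comparable between experiments.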
Model training
To train on the image dataset, a pre-trained CNN model is used. This method is called transfer learning. Jeremy recommends using pre-trained models to speed up training and improve accuracy. This is especially true for computer vision problems.
learn = cnn_learner(data, resnet34, metrics=error_rate)
learn.fine_tune(4)
The ResNet34 architecture is used, and the results are validated by error rate. Because a pre-trained model is used for training, the model is fine-tuned rather than fitted from scratch.
You can run more epochs and see how the model performs. Choose the right number of epochs to avoid overfitting.
You can also try accuracy (accuracy = 1 - error rate) instead of error_rate to validate model performance. Both are used to validate the model's output. In this example, 20% of the data is held out for validation, so the model is trained on only 80% of the data. This is a critical step in checking the performance of any machine learning model. You can also run this model with a different number of ResNet layers (the options are 18, 50, 101, and 152). Unless you have a large dataset that will produce accurate results, this may again lead to overfitting.
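The relationship between the two metrics is easy to check by hand. The sketch below uses plain Python lists and made-up predictions rather than fastai's metric functions.

```python
def error_rate(preds, targets):
    """Fraction of predictions that do not match the target label."""
    wrong = sum(p != t for p, t in zip(preds, targets))
    return wrong / len(targets)

def accuracy(preds, targets):
    """Fraction of predictions that match the target label."""
    return 1 - error_rate(preds, targets)

# Made-up validation labels and predictions, for illustration only.
targets = ["covid", "noCovid", "covid", "noCovid", "covid"]
preds = ["covid", "noCovid", "noCovid", "noCovid", "covid"]

err = error_rate(preds, targets)
acc = accuracy(preds, targets)
print(f"error rate: {err:.2f}, accuracy: {acc:.2f}")
```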
Verify model performance
Model performance can be validated in different ways. One popular method is the confusion matrix. The diagonal values of the matrix indicate the correct predictions for each class, while the other cells indicate the number of wrong predictions.
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()
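To see how the matrix is read, here is a small sketch that builds one from scratch. The labels and predictions are made up, and the confusion_matrix helper is written for this illustration; it is not the fastai function.

```python
from collections import Counter

def confusion_matrix(targets, preds, classes):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(targets, preds))
    return [[counts[(actual, predicted)] for predicted in classes] for actual in classes]

classes = ["covid", "noCovid"]
# Made-up validation results, for illustration only.
targets = ["covid", "covid", "covid", "noCovid", "noCovid", "noCovid"]
preds = ["covid", "covid", "noCovid", "noCovid", "noCovid", "covid"]

matrix = confusion_matrix(targets, preds, classes)
correct = sum(matrix[i][i] for i in range(len(classes)))  # diagonal cells
wrong = len(targets) - correct                            # everything off the diagonal

for name, row in zip(classes, matrix):
    print(name, row)
print("correct:", correct, "wrong:", wrong)
```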
fastai provides a useful function for viewing the wrong predictions with the highest loss. Its output shows the predicted label, target label, loss, and probability for each image. A high probability means the model is highly confident in its prediction; the value ranges between 0 and 1. A high loss indicates how poorly the model performed on that image.
interp.plot_top_losses(5, nrows=1, figsize = (25,5))
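Why confident mistakes end up at the top of this list can be shown with a few numbers. The sketch assumes cross-entropy loss, which as far as I know is what fastai picks by default for single-label classification; the probabilities are made up.

```python
import math

def cross_entropy(probs, target_idx):
    """Loss for one image: negative log of the probability given to the true class."""
    return -math.log(probs[target_idx])

# In all three cases the true class is index 1.
confident_right = cross_entropy([0.05, 0.95], target_idx=1)
unsure_right = cross_entropy([0.45, 0.55], target_idx=1)
confident_wrong = cross_entropy([0.95, 0.05], target_idx=1)

print(f"confident and right: {confident_right:.3f}")
print(f"unsure but right:    {unsure_right:.3f}")
print(f"confident and wrong: {confident_wrong:.3f}")
```

The loss grows as the probability assigned to the true class shrinks, so an image the model got wrong with high confidence has the largest loss.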
Another great fastai feature is ImageClassifierCleaner (a GUI), which cleans up faulty images by deleting them or changing their labels. This helps a lot with data preprocessing and thereby improves the accuracy of the model.
Jeremy recommends running this function after a basic round of training on the images, because it gives you an idea of the kinds of anomalies in the dataset.
from fastai.vision.widgets import *
cleaner = ImageClassifierCleaner(learn)
cleaner
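The widget only records your choices; the files still have to be deleted or moved afterwards. Below is a minimal sketch of that step using only the standard library. The helper apply_cleaning and the file layout are hypothetical. With fastai, as I understand the widget's interface, the file list would come from cleaner.fns, the indices to delete from cleaner.delete(), and the (index, new label) pairs from cleaner.change().

```python
import shutil
import tempfile
from pathlib import Path

def apply_cleaning(fns, delete_idxs, changes, root):
    """Delete the flagged files and move relabeled files into their new class folder."""
    for idx in delete_idxs:
        Path(fns[idx]).unlink()
    for idx, new_label in changes:
        target_dir = Path(root) / new_label
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(fns[idx]), str(target_dir))

# Demo on a throwaway folder with three empty placeholder files.
root = Path(tempfile.mkdtemp())
(root / "covid").mkdir()
fns = [root / "covid" / f"img_{i}.png" for i in range(3)]
for fn in fns:
    fn.write_bytes(b"")

# Delete image 0 and relabel image 1 as noCovid.
apply_cleaning(fns, delete_idxs=[0], changes=[(1, "noCovid")], root=root)
remaining = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.png"))
print(remaining)

shutil.rmtree(root)  # remove the throwaway folder
```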
Save and deploy the model
Once you have trained the model and are satisfied with the results, you can deploy it. To deploy the model to a production environment, you need to save both the model's architecture and the parameters it was trained with. The export method does this. The exported model is saved as a PKL file, which is a file created by pickle (a Python module).
learn.export()
Create an inference learner from the exported file; it can be used to deploy the model as an application. The inference learner predicts the output for one new image at a time. The prediction returns three values: the predicted class, the index of the predicted class, and the probability of each class.
learn_inf = load_learner(path/'export.pkl')
learn_inf.predict("img")  # "img" stands for the path of the image file to classify
('noCovid', tensor(1), tensor([5.4443e-05, 9.9995e-01]))  # prediction output
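The three returned values fit together as follows. This sketch uses plain Python values as stand-ins for the tensors in the sample output above, and the class order in vocab is an assumption inferred from that output.

```python
# Assumed class order: index 0 is covid, index 1 is noCovid.
vocab = ["covid", "noCovid"]

# Plain Python stand-ins for ('noCovid', tensor(1), tensor([5.4443e-05, 9.9995e-01])).
pred_class, pred_idx, probs = ("noCovid", 1, [5.4443e-05, 9.9995e-01])

# The index selects both the class name and its probability.
confidence = probs[pred_idx]
message = f"Prediction: {pred_class}; probability: {confidence:.5f}"
print(message)
```

A label widget in the web application can then display a message like this one.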
There are many ways to create a web application. One of the easiest is to use IPython widgets, which are GUI components, to create the objects your application needs in a Jupyter notebook.
from fastai.vision.widgets import *
btn_upload = widgets.FileUpload()
out_pl = widgets.Output()
lbl_pred = widgets.Label()
After designing the application elements, deploy the model with Voilà, which runs a Jupyter notebook like a web application. It removes all cell inputs and shows only the model output. To view the notebook as a Voilà web application, replace the word "notebook" in the browser URL with "voila/render". Voilà must be installed and run from the same notebook that contains the trained model and the IPython widgets.
!pip install voila
!jupyter serverextension enable voila --sys-prefix
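The URL change described above can be written as a one-line helper. This is only a sketch: the function to_voila_url, the host, and the notebook name are hypothetical, and it assumes a classic Jupyter server whose notebook URLs contain the path segment /notebooks/.

```python
def to_voila_url(notebook_url):
    """Swap the '/notebooks/' path segment for '/voila/render/'."""
    return notebook_url.replace("/notebooks/", "/voila/render/", 1)

# Hypothetical local notebook URL, for illustration only.
voila_url = to_voila_url("http://localhost:8888/notebooks/covid_app.ipynb")
print(voila_url)
```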
Conclusion
That's it: you have built and deployed a cool image classifier application with the fastai library in just eight steps! What I have shown in this article is only the tip of the iceberg. There are many more fastai components for various deep learning use cases in NLP and computer vision that you can explore.
Below are fastai learning resources, along with my git repo, which contains the code and images for the image classifier explained in this article.
Covid19 X-ray image classifier: contains the complete code and dataset discussed in this article
Covers all the lessons taught in the fast.ai course
Covers the complete fastai API documentation
fast.ai community forum
Link to the original article: https://www.analyticsvidhya.c...