Developing and deploying an image classifier application with fastai

Author | [email protected]
Compiled by | Flin
Source | analyticsvidhya

Introduction

fastai is a popular open source library for learning and practicing machine learning and deep learning. Jeremy Howard and Rachel Thomas founded fast.ai with the goal of making deep learning resources more accessible. All the resources fast.ai provides, such as courses, software, and research papers, are completely free.

In August 2020, fastai v2 was released. This version promises a faster and more flexible implementation of the deep learning framework. The 2020 fastai course combines the core concepts of machine learning and deep learning. It also introduces users to important aspects of model production and deployment.

In this article, I will discuss the techniques introduced in the first three lessons of the fast.ai beginner's course for building a fast and simple image classification model. While building the model, you will also learn how to easily develop a web application for it and deploy it to production.

This article follows the top-down approach Jeremy uses in his courses. You will first learn how to train an image classifier. Later, the details of the model used for classification are explained. To follow this article, you need to know Python, because fastai is written in Python and built on PyTorch. I suggest running the code in Google Colab or Gradient, because we need GPU access and fastai can be easily installed on both platforms.

Install, import, and load the dataset

!pip install -Uqq fastbook

import fastbook
fastbook.setup_book()

from fastbook import *
from fastai.vision.widgets import *

Install fastai and import the necessary libraries. If you are using Colab, you have to grant access to your Google Drive to save files and images. You can download any image dataset from sources such as Kaggle and Bing image search. fast.ai also has a large collection of images. In this article I used a set of chest X-ray images from https://github.com/ieee8023/c...

path = Path('/content/gdrive/My Drive/Covid19images')

Save the path to the dataset's location in a Path() object. If you use one of the fast.ai datasets, you can use the following code:

path = untar_data(URLs.PETS)/'images'

This downloads and extracts the images from the PETS dataset in the fastai dataset collection.

Check the image path and display some sample images from the dataset. I have used the Python Imaging Library (PIL) for this.

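path.ls()  # list the contents of the dataset folder
from PIL import Image
img = Image.open(path/'train/covid/1-s2.0-S1684118220300682-main.pdf-002-a2.png')
print(img.shape)  # fastai adds a shape property (height, width) to PIL images
img.to_thumb(128,128)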

In this image classification problem, I am going to train a model to classify X-ray images as COVID or No COVID. The preprocessed dataset has been placed in separate COVID and No COVID folders (source: Christian Tutivén Gálvez).

If you are using the fast.ai dataset, use the following function to label the images based on the file name:

def is_cat(x): return x[0].isupper()

PETS is a collection of cat and dog images. The file names of cat images start with a capital letter, so they are easy to classify.
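As a quick illustration of this labelling rule, the sketch below applies is_cat to a few file names. The file names are made-up examples in the style of the PETS dataset, not files I have checked against the real download.

```python
def is_cat(x):
    # Cat images in PETS have file names that start with an uppercase letter.
    return x[0].isupper()

# Hypothetical file names, for illustration only.
sample_names = ["Birman_12.jpg", "beagle_7.jpg", "Sphynx_3.jpg"]

# Map each file name to True (cat) or False (dog).
labels = {name: is_cat(name) for name in sample_names}
print(labels)
```

Because the rule only looks at the first character, it works on the bare file name and not on a full path.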

Image transformation

Image transformation is a key step in training image models. It is also known as data augmentation. Image transformation is necessary to keep the model from overfitting. There are many ways to transform images, such as resizing, cropping, squishing, and padding. However, squishing distorts the original information in the image, and padding adds extra pixels. Randomly resizing and cropping the images therefore tends to give good results.

In the method shown in the example below, a random region of each image is sampled in each epoch. This lets the model learn more details of each image and so achieve higher accuracy.

Another point to remember is to always transform only the training images and leave the validation images unmodified. The fastai library handles this by default.

item_tfms=Resize(128, ResizeMethod.Squish)                  # squish the whole image to 128x128
item_tfms=Resize(128, ResizeMethod.Pad, pad_mode='zeros')   # pad the image with black pixels
item_tfms=RandomResizedCrop(128, min_scale=0.3)             # min_scale=0.3: each crop covers at least 30% of the image area
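To make the idea of sampling a random region concrete, here is a simplified sketch in plain Python. It is not fastai's implementation: the function name random_crop_box is my own, and I assume the crop keeps the image's aspect ratio, whereas the library version also varies the aspect ratio and then resizes the crop to the target size.

```python
import random

def random_crop_box(width, height, min_scale=0.3, rng=random):
    """Pick a random crop box covering between min_scale and 100% of the image area."""
    scale = rng.uniform(min_scale, 1.0)    # fraction of the image area to keep
    side = scale ** 0.5                    # shrink each side by sqrt(scale) to keep the aspect ratio
    crop_w = max(1, round(width * side))
    crop_h = max(1, round(height * side))
    left = rng.randint(0, width - crop_w)  # random position that keeps the box inside the image
    top = rng.randint(0, height - crop_h)
    return left, top, crop_w, crop_h

rng = random.Random(0)  # seeded so the example is repeatable
boxes = [random_crop_box(640, 480, min_scale=0.3, rng=rng) for _ in range(3)]
for box in boxes:
    print(box)
```

Each call returns a different box, which is why the model sees a slightly different view of the same image in every epoch.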

The fastai library provides a standard set of augmentations through the aug_transforms function. If the image sizes are uniform, these can be applied in batches, which saves a lot of training time.

tfms = aug_transforms(do_flip = True, flip_vert = False, mult=2.0)

The DataLoaders class in fastai is a convenient way to store the various objects used to train and validate a model. If you want to customize the objects used during training, you can use the DataBlock class together with DataLoaders.

data= ImageDataLoaders.from_folder(path,train = "train", valid_pct=0.2, item_tfms=Resize(128), batch_tfms=tfms, bs = 30, num_workers = 4)

If the image labels are defined in a metadata file, you can use DataBlock to separate the images and labels into two different blocks, as shown in the code snippet below. Use the defined data block with the dataloaders function to access the images.

Data = DataBlock( blocks=(ImageBlock, CategoryBlock), get_items=get_image_files, 
splitter=RandomSplitter(valid_pct=0.2, seed=42), get_y=parent_label, item_tfms=Resize(128))
dls = Data.dataloaders(path)
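The two helpers named in the data block can be approximated in a few lines of plain Python. This is a rough sketch of the behaviour only: parent_label_sketch and random_split are hypothetical stand-ins I wrote for illustration, and the file layout is invented.

```python
import random
from pathlib import Path

def parent_label_sketch(file_path):
    # Label an image with the name of the folder that contains it.
    return Path(file_path).parent.name

def random_split(items, valid_pct=0.2, seed=42):
    # Shuffle the indices with a fixed seed, then hold out valid_pct of them for validation.
    idxs = list(range(len(items)))
    random.Random(seed).shuffle(idxs)
    cut = int(valid_pct * len(items))
    valid = [items[i] for i in idxs[:cut]]
    train = [items[i] for i in idxs[cut:]]
    return train, valid

# Hypothetical file layout: one folder per class.
files = [f"train/covid/img_{i}.png" for i in range(5)]
files += [f"train/noCovid/img_{i}.png" for i in range(5)]

train_files, valid_files = random_split(files)
train_labels = [parent_label_sketch(f) for f in train_files]
print(len(train_files), len(valid_files), set(train_labels))
```

Fixing the seed means the same images land in the validation set on every run, so results from different training runs stay comparable.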

Model training

To train on the image dataset, a pre-trained CNN model is used. This approach is called transfer learning. Jeremy recommends using pre-trained models to speed up training and improve accuracy. This is especially true for computer vision problems.

learn = cnn_learner(data, resnet34, metrics=error_rate)
learn.fine_tune(4)

The ResNet34 architecture is used, and the results are validated using the error rate. Because a pre-trained model is used for training, the model is fine-tuned rather than fit from scratch.

You can run more epochs and see how the model performs. Choose the right number of epochs to avoid overfitting.

You can also try validating the model's performance with accuracy (accuracy = 1 - error rate) instead of error_rate. Both are used to validate the model's output. In this example, 20% of the data is held out for validation, so the model is trained on only 80% of the data. This is a critical step in checking the performance of any machine learning model. You can also run this model with a different number of ResNet layers (the options are 18, 50, 101, and 152). Unless you have a large dataset that will produce accurate results, this may again lead to overfitting.
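The relationship between the two metrics is simple enough to check by hand. The sketch below uses made-up predictions for five validation images; error_rate_sketch is an illustrative helper, not the fastai metric itself.

```python
def error_rate_sketch(preds, targets):
    # Fraction of predictions that do not match the target label.
    wrong = sum(p != t for p, t in zip(preds, targets))
    return wrong / len(targets)

# Made-up validation labels and predictions, for illustration only.
targets = ["covid", "covid", "noCovid", "noCovid", "noCovid"]
preds = ["covid", "noCovid", "noCovid", "noCovid", "noCovid"]

err = error_rate_sketch(preds, targets)
acc = 1 - err  # accuracy is the complement of the error rate
print(f"error rate: {err:.2f}, accuracy: {acc:.2f}")
```

Since one of the five predictions is wrong, the two numbers carry the same information; which one you report is a matter of preference.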

Verify model performance

Model performance can be validated in different ways. One popular method is the confusion matrix. The diagonal values of the matrix indicate the correct predictions for each class, whereas the other cell values indicate the number of wrong predictions.

interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()
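For readers who want to see what the plot counts, here is a minimal sketch that builds the same kind of matrix from two lists of labels. The labels are invented, and confusion_matrix_sketch is a hypothetical helper; I assume rows are actual classes and columns are predicted classes.

```python
from collections import Counter

def confusion_matrix_sketch(targets, preds, classes):
    # Count each (actual, predicted) pair, then lay the counts out as rows=actual, columns=predicted.
    counts = Counter(zip(targets, preds))
    return [[counts[(actual, predicted)] for predicted in classes] for actual in classes]

classes = ["covid", "noCovid"]
# Made-up labels, for illustration only.
targets = ["covid", "covid", "covid", "noCovid", "noCovid", "noCovid"]
preds = ["covid", "covid", "noCovid", "noCovid", "noCovid", "covid"]

matrix = confusion_matrix_sketch(targets, preds, classes)
correct = sum(matrix[i][i] for i in range(len(classes)))  # the diagonal holds the correct predictions
for name, row in zip(classes, matrix):
    print(name, row)
```

The off-diagonal cells show which class is mistaken for which, something a single accuracy number cannot tell you.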

fastai provides a useful function for viewing the wrong predictions with the highest loss. The output of this function shows the predicted label, target label, loss, and probability for each image. A high probability indicates that the model is highly confident; it varies between 0 and 1. A high loss indicates how poorly the model performed on that image.

interp.plot_top_losses(5, nrows=1, figsize = (25,5))
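The ranking behind this view can be sketched as follows, assuming the loss is cross-entropy, which for a single image is the negative log of the probability the model gave to the correct class. The image names and probabilities are invented for illustration.

```python
import math

def cross_entropy(prob_of_target):
    # Loss for one image: small when the model gives the true class a high probability.
    return -math.log(prob_of_target)

# (image name, target label, predicted label, probability assigned to the target label)
results = [
    ("a.png", "covid", "covid", 0.98),
    ("b.png", "covid", "noCovid", 0.05),
    ("c.png", "noCovid", "noCovid", 0.70),
    ("d.png", "noCovid", "covid", 0.30),
]

# Sort from highest loss to lowest and keep the worst two.
ranked = sorted(results, key=lambda r: cross_entropy(r[3]), reverse=True)
top_2 = [r[0] for r in ranked[:2]]
print(top_2)
```

A confidently wrong prediction gets the largest loss, so it appears first in the ranking.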

Another great fastai feature is ImageClassifierCleaner (a GUI), which cleans up faulty images by deleting them or changing their labels. This helps greatly with data preprocessing and thus improves the model's accuracy.

Jeremy recommends running this function after basic training on the images, because it gives you an idea of the kinds of anomalies in the dataset.

from fastai.vision.widgets import *
cleaner = ImageClassifierCleaner(learn)
cleaner

Save and deploy the model

Once you have trained the model and are satisfied with the results, you can deploy it. To deploy the model to production, you need to save the model's architecture and the parameters it was trained with. The export method is used for this. The exported model is saved as a PKL file, which is a file created by pickle (a Python module).

learn.export()
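If you have not used pickle before, the round trip below shows the general mechanism with the standard library alone. The dictionary is only a stand-in for a trained learner, and I am not claiming this is what export does internally.

```python
import pickle
import tempfile
from pathlib import Path

# Stand-in object; a real export holds the model architecture and its trained parameters.
fake_model = {"arch": "resnet34", "classes": ["covid", "noCovid"], "weights": [0.1, -0.2, 0.3]}

with tempfile.TemporaryDirectory() as tmp:
    pkl_path = Path(tmp) / "export.pkl"
    # Serialize the object to disk.
    with open(pkl_path, "wb") as f:
        pickle.dump(fake_model, f)
    # Read it back into a new Python object.
    with open(pkl_path, "rb") as f:
        restored = pickle.load(f)

print(restored == fake_model)
```

Loading a pickle file runs code from that file, so only load PKL files that come from a source you trust.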

Create an inference learner from the exported file; it can be used to deploy the model as an application. The inference learner predicts the output for one new image at a time. The prediction returns three values: the predicted class, the index of the predicted class, and the probabilities of each class.

learn_inf = load_learner(path/'export.pkl')
learn_inf.predict("img")

Prediction: ('noCovid', tensor(1), tensor([5.4443e-05, 9.9995e-01]))
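The three returned values are usually unpacked and combined into a message for the user. The sketch below uses plain Python numbers as a stand-in for the tensors in the output above, so it runs without a trained model; the message format is my own choice.

```python
# Stand-in for the tuple returned by predict; the real values are tensors.
prediction = ("noCovid", 1, [5.4443e-05, 9.9995e-01])

pred_class, pred_idx, probs = prediction

# The index selects the probability that belongs to the predicted class.
confidence = probs[pred_idx]
message = f"Prediction: {pred_class}; Probability: {confidence:.4f}"
print(message)
```

The probabilities are listed in the same order as the classes, which is why the index is returned alongside the class name.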

There are many ways to create web applications. One of the easiest is to create the objects your application needs in a Jupyter notebook, using IPython widgets as GUI components.

from fastai.vision.widgets import *
btn_upload = widgets.FileUpload()
out_pl = widgets.Output()
lbl_pred = widgets.Label()

After designing the application elements, deploy the model with Voilà, which runs a Jupyter notebook like a web application. It removes all cell inputs and shows only the model output. To view the notebook as a Voilà web application, replace the word "notebooks" in the browser URL with "voila/render". Voilà must be installed and run from the same notebook that contains the trained model and the IPython widgets.

!pip install voila
!jupyter serverextension enable voila --sys-prefix
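The URL change can be expressed as a one-line string replacement. This sketch assumes the classic Jupyter URL layout, where the notebook path follows /notebooks/; the local address and the notebook name are hypothetical.

```python
def to_voila_url(notebook_url):
    # Swap the first "/notebooks/" path segment for "/voila/render/".
    return notebook_url.replace("/notebooks/", "/voila/render/", 1)

# Hypothetical local address and notebook name, for illustration only.
example = to_voila_url("http://localhost:8888/notebooks/covid_classifier.ipynb")
print(example)
```

Opening the resulting address in the browser shows the widgets and outputs without the code cells.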

Conclusion

That's it: you have built and deployed a cool image classifier application with the fastai library in just eight steps! What I have shown in this article is only the tip of the iceberg. There are many more fastai components for various deep learning use cases related to NLP and computer vision that you can explore.

Below are fastai learning resources, along with my git repo, which contains the code and images for the image classifier explained in this article.

Covid19 X-ray image classifier: contains the complete code and dataset discussed in this article
- https://github.com/RajiRai/Fa...
Covers all the lessons taught in the fast.ai course
- https://github.com/fastai/fas...
The complete fastai API documentation
- https://docs.fast.ai/
fast.ai community forum
- https://forums.fast.ai/