Classifying Nigerian Food Images Using FastAI

Onuoha Obinna
5 min read · Jun 26, 2019
Egusi Soup

Hi there!

So I took Jeremy Howard's advice to gather a dataset and try building a neural network classifier on it using the fastai library. Truthfully, it has turned out to be a great way of understanding what is actually going on at every stage.

In this article I will highlight how I downloaded images of some Nigerian foods and built a classifier with just a few lines of code, made possible by the fastai library.

Getting the Images

I used the google_images_download library to send image requests through Google search and download the top results. Sample code for downloading images of egusi soup is shown below.

# importing the library
from google_images_download import google_images_download

# class instantiation
response = google_images_download.googleimagesdownload()

# creating the dictionary of arguments
# (downloading more than 100 images requires a chromedriver path)
arguments = {"keywords": "Egusi Soup",
             "limit": 1000,
             "print_urls": True,
             "chromedriver": "C:/Users/HP/Downloads/chromedriver_win32/chromedriver.exe"}

# passing the arguments to the download function
paths = response.download(arguments)

# printing absolute paths of the downloaded images
print(paths)

google_images_download can also take a comma-separated list of keywords to search for and download, so it was relatively easy to get the dataset from the internet. The tedious part was going through each of the dataset folders to weed out images that were not supposed to be in there.
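For instance, several dishes can be fetched in one call — the `keywords` argument accepts a comma-separated string, and the library creates one folder per keyword. A sketch (the dish names here are illustrative, not necessarily the exact classes I used):

```python
from google_images_download import google_images_download

response = google_images_download.googleimagesdownload()

# one download call for several classes; a separate folder is
# created for each comma-separated keyword
arguments = {
    "keywords": "Egusi Soup,Jollof Rice,Moi Moi,Suya",
    "limit": 100,  # staying at 100 or below avoids needing chromedriver
    "print_urls": False,
}
paths = response.download(arguments)
```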

Building the Classifier

I used Google Colab to gain access to a free GPU :) . Now let's get right to it.

First, import all the needed modules.

import fastai
from fastai.vision import *

The dataset was uploaded to my Google Drive, so it was necessary to mount the drive on the current session to get access to the data folders.

from google.colab import drive
drive.mount('/content/drive')

Set the path to the directory containing the dataset

path='drive/My Drive/NigFoods dataset/train'

The next step is to augment the dataset by adjusting some image transform arguments that would work with the type of images contained in the dataset.

Data augmentation is perhaps the most important regularization technique when training a model for Computer Vision: instead of feeding the model with the same pictures every time, we do small random transformations (a bit of rotation, zoom, translation, etc…) that don’t change what’s inside the image (to the human eye) but do change its pixel values. Models trained with data augmentation will then generalize better. (docs.fastai.com)

transforms=get_transforms(do_flip=True,flip_vert=True,max_rotate=125,max_zoom=1.05,max_warp=0.1)

Create a DataBunch, which will contain the training images, the validation images and the corresponding labels. This has been made relatively easy, as it can be a simple one-liner.

Before any work can be done a dataset needs to be converted into a DataBunch object, and in the case of the computer vision data - specifically into an ImageDataBunch subclass. This is done with the help of data block API and the ImageList class and its subclasses. (docs.fastai.com)

np.random.seed(42)
data=ImageDataBunch.from_folder(path,valid_pct=0.2,train=".",ds_tfms =transforms,size=224,num_workers=4).normalize(imagenet_stats)

View the classes in the dataset.
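The original post showed this step as a screenshot; a minimal way to inspect the classes that the DataBunch inferred from the folder names is:

```python
# class labels, inferred from the folder names
print(data.classes)

# number of classes and the train/validation split sizes
print(data.c, len(data.train_ds), len(data.valid_ds))
```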

View a typical DataBunch training batch.
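This was also a screenshot in the original; `show_batch` renders a grid of (transformed) training images with their labels:

```python
# display a 3x3 grid of augmented training images with labels
data.show_batch(rows=3, figsize=(7, 6))
```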

You can clearly see the effect of the transformations. It doesn't matter what angle an image was taken from; the model will tend to generalize a little better since it will have seen samples from different angles and lighting conditions.

Create the learner instance. A basic fastai learner object requires three arguments: the data, the model architecture and the evaluation metric.

The model architecture simply specifies things like the layer arrangements of the learner.

learn=cnn_learner(data,models.resnet34,
metrics=[error_rate,accuracy])

The next step is to go right ahead and train the model.
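The training call itself wasn't shown in the original; with fastai v1 it would look something like this (5 epochs, matching the run described below):

```python
# train the head of the network for 5 epochs using the
# one-cycle learning-rate policy
learn.fit_one_cycle(5)
```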

In 5 epochs, which took roughly 13 minutes, the model was able to achieve about 80% accuracy. Now, can that be improved? Definitely!

First, let's save this model.

learn.save('stage-1')

Then unfreeze the model and train some more. Wait, what does it mean to unfreeze a model?

Well, remember we used the resnet34 architecture to create our learner instance. During training, all we did was update the weights of the very last layers, while the initial layers were not affected. The unfreeze function allows the weights of all the layers of the resnet34 to be updated.

learn.unfreeze()

Find a good learning rate for the model.

learn.lr_find()
learn.recorder.plot()

A good learning rate would most likely occur right before the loss shoots right up.

So I picked a learning rate range of (1e-05, 1e-03).

Now train a little more using that learning rate.

learn.fit_one_cycle(4,max_lr=slice(1e-5,1e-3))

After unfreezing, accuracy increased from 80% to 89%.

Now let's quickly look at the foods that are being misclassified the most.

report=ClassificationInterpretation.from_learner(learn)
report.plot_top_losses(9, figsize=(15,11))

A takeaway from the plot of top losses is that the top-left image was correctly predicted to be an image of rice and beans, but unfortunately it was located in the noodles folder. Little findings like this help us evaluate not only our model's performance but also our dataset's correctness.
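To list the most frequent confusions directly, fastai's ClassificationInterpretation also provides `most_confused`, which complements the plot:

```python
# (actual, predicted, count) tuples for the most common mix-ups,
# keeping only pairs confused at least twice
print(report.most_confused(min_val=2))
```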

What next?

We could always train some more, since the accuracy hasn't begun to drop yet. We could also get more data and make sure that the data we already have is accurate.

I am a computer engineering graduate with an interest in data science, and this is my first Medium post.

link to profile: onuohasilver.github.io
