Image Recognition with Neural Networks (part 1)

This article is the first of a 2-part series explaining how to build an image recognition neural network. The code below is also available as a Google Colaboratory interactive notebook. I will share the link at the very end of part 2, so that you can run the code on your own. I also recommend reading my article on Machine Learning first, so you are familiar with things like TPUs and interactive notebooks.

DISCLAIMER: Do not rush through this article; take your time to understand the libraries and the code itself. This is not simple stuff, even though we will be using the data set known as the hello world of classification. At the very end you will get to execute all the code you see here. Do not be impatient: build your knowledge around this code’s context so that when you do execute it, you actually know what it does.

Building a simple neural network

So, without further ado, I would like to show you how to use Cloud TPUs in Google Colab to build a simple neural network classification model that uses the Iris data set to predict the species of a flower. The model uses 4 input features (SepalLength, SepalWidth, PetalLength, PetalWidth) to determine one of the following flower species: Setosa, Versicolor, Virginica. We could do something more exciting, like classifying a health condition based on a smart body scan or grouping similar commercial properties into categories for real estate listings. But for now we will analyse this hello world of classification so you can grasp the basic idea and get to use some of the tools involved in the process.

We will start by importing all the required libraries. It would be beneficial to at least quickly google the libraries that we will be importing shortly. That way you’ll be more familiar with the code I will be explaining. Here are some good tutorials for the two most often used libraries that I definitely recommend you familiarize yourself with – numpy and pandas.

Importing TensorFlow

Once we’ve loaded the libraries we run a simple check on TensorFlow just to make sure it imported ok.

import json
import os
import pandas as pd
import pprint
import tensorflow as tf
import time
import numpy as np
from tensorflow import keras
from keras import metrics
from IPython.display import Image
from keras.utils import plot_model
import matplotlib.pyplot as plt
print(tf.__version__) #the simple check

Next, we need to check for the environment variable ‘COLAB_TPU_ADDR’. Its presence indicates that a TPU resource is available to us. If the check fails, go to the “Edit” menu at the top of the notebook and select “Notebook settings”. Once there, select TPU as the hardware accelerator, so that the cloud machine Google made available for this session is reconfigured to use a TPU. We will also start a quick TensorFlow session so that we can check what devices are available for our computations on the allocated machine. Each output line shows the name of the device, the type of device (CPU/GPU/TPU) and finally the amount of memory allocated for that device.

use_tpu = True #@param {type:"boolean"}
if use_tpu:
    assert 'COLAB_TPU_ADDR' in os.environ, 'Missing TPU; did you request a TPU in Notebook Settings?'
if 'COLAB_TPU_ADDR' in os.environ:
    TF_MASTER = 'grpc://{}'.format(os.environ['COLAB_TPU_ADDR'])
    with tf.Session(TF_MASTER) as session:
        print('List of devices:')
        devices = session.list_devices()
        for d in devices:
            # name | device type | memory limit converted from bytes to MB
            print(f'{d.name} | type:{d.device_type} | memory:{(d.memory_limit_bytes/1048576):.0f} MB')
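
If you want to see how the address string is put together without having a TPU attached, here is a minimal sketch. The helper names tpu_master_from_env and bytes_to_mb, as well as the sample address, are made up for illustration; only the COLAB_TPU_ADDR variable and the grpc:// prefix come from the code above.

```python
import os


def tpu_master_from_env(env=os.environ):
    """Return the gRPC address of the Colab TPU, or None when no TPU is attached."""
    address = env.get('COLAB_TPU_ADDR')
    return f'grpc://{address}' if address else None


def bytes_to_mb(num_bytes):
    """Convert a byte count to whole megabytes (1 MB = 1048576 bytes here)."""
    return round(num_bytes / 1048576)


# Made-up address, used only to show the resulting string format.
print(tpu_master_from_env({'COLAB_TPU_ADDR': '10.0.0.2:8470'}))
# An empty environment stands for a notebook without a TPU.
print(tpu_master_from_env({}))
print(bytes_to_mb(268435456))
```

Passing a plain dictionary instead of os.environ makes the behaviour easy to check on any machine.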

Below we define parameters required later by the network training and evaluation phases:

# TPU address
tpu_address = TF_MASTER

# [TRAINING PHASE] Number of epochs - Epochs are basically training cycles,
# on each cycle network will try to improve its accuracy.
# Concept of epoch is very helpful - you can batch them and
# experiment with their number to see what works for
# particular datasets and models
epochs = 50

# [TRAINING PHASE] Number of steps_per_epoch - Number of times we want our
# full training dataset
# to be passed through the model on each epoch. Given 50 epochs and
# 20 steps_per_epoch on a training dataset consisting of 121 samples we will
# end up passing 121000 data rows (samples) through our network.
steps_per_epoch = 20

# [EVALUATION PHASE] Total number of evaluation steps. If '0',
# evaluation after training is skipped. We will use this
# parameter in the network evaluation function, it means the
# verification data will be fed to the network in 50 steps each containing
# data rows in whatever amount we pass as batch_size - we are not specifying batch size
# so the default batch size of 32 will be used. So our evaluation session
# will consist of 50 data passes through the model each consisting of 32 rows
# taken from the training dataset
eval_steps = 50
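
The row counts quoted in the comments can be checked with a few lines of arithmetic. This is only a sketch that follows the reading of steps_per_epoch used above (every step passes the full training set); the variable names train_samples, default_batch_size and the two rows_seen_* totals are mine.

```python
epochs = 50
steps_per_epoch = 20
train_samples = 121        # training set size quoted in the comments
eval_steps = 50
default_batch_size = 32    # batch size used when none is specified

# Every step is assumed to pass the full training set through the model.
rows_seen_in_training = epochs * steps_per_epoch * train_samples

# Every evaluation step passes one batch through the model.
rows_seen_in_evaluation = eval_steps * default_batch_size

print(rows_seen_in_training)    # 121000
print(rows_seen_in_evaluation)  # 1600
```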

Next, we specify information about our training and testing data, including the URLs under which we can find the CSV files. We also create an array of column names. It will be used by the pandas library, which we will use to load the CSV files, to label the columns of the DataFrame object created from each file. The labels are purely informational; you can pass them but don’t have to. More on this in the next step.

TRAIN_DATA_URL = ""
TEST_DATA_URL = ""

CSV_COLUMN_NAMES = ['SepalLength', 'SepalWidth',
                    'PetalLength', 'PetalWidth', 'Species']

# This is 3 types of species we will teach our model to predict (or to be more
# precise: classify input data as one of these)
SPICIES_UNDERSTOOD_BY_CLASSIFIER = ['Setosa', 'Versicolor', 'Virginica']

Pandas helpers

The following function uses pandas helpers to create 2 sets of data. The first set is our training data, that is, the information we will pass to our model so it can learn how to predict 3 classes of Iris flowers. The second set is our testing data: basically a smaller version of the first one, but containing different values. The split between training and testing data is usually 80% – 20%. Each set describes the parameters of flowers in subset x and the corresponding classes of flowers in subset y. The training process will iterate over our network (which we will configure very soon), passing it the TRAIN data and trying to smart-guess the best weights for the neurons in each layer.

The goal of this exercise is to allow our network/model to predict the classes of the Iris flowers in the test set as accurately as possible. So after each iteration it will test its accuracy against the TEST data, and when its predictions are in an acceptable range it stops, giving us a model that in theory should now be able to predict classes of Iris flowers for any input parameters from outside of the test / train data sets (How cool is this?).

We also use read_csv provided by pandas to create DataFrames. They are basically annotated arrays with a lot of metadata and helper functions. It’s data scientists’ preferred way of loading CSV files into memory, and here you can read more about the reason for this.
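
In this article the training and testing data arrive as two ready-made CSV files, so we do not split anything ourselves. If you ever need to produce an 80% / 20% split on your own, one possible sketch looks like this. The function name split_indices and its parameters are hypothetical, and 150 is simply the number of rows in the classic Iris data set.

```python
import random


def split_indices(n_rows, test_fraction=0.2, seed=0):
    """Shuffle row indices and split them into train and test lists."""
    indices = list(range(n_rows))
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_rows * test_fraction))
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = split_indices(150)
print(len(train_idx), len(test_idx))
```

Shuffling before splitting matters because data files are often sorted by class, and an unshuffled split could leave a whole species out of the training rows.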

# as you can remember Y set is the expected prediction results for X which
# is its input parameters - so y_name is just a name of column in dataset representing
# iris flower class, other 4 columns will represent each flowers parameters
def load_test_and_train_data(y_name='Species'):
    # get_file downloads the file to a local data directory
    # and returns path to the downloaded file
    train_path = tf.keras.utils.get_file(TRAIN_DATA_URL.split('/')[-1], TRAIN_DATA_URL)
    test_path = tf.keras.utils.get_file(TEST_DATA_URL.split('/')[-1], TEST_DATA_URL)

    # Pandas read_csv creates a panda DataFrame from the csv file
    #
    train = pd.read_csv(train_path, names=CSV_COLUMN_NAMES, header=0,
                        dtype={'SepalLength':, 'SepalWidth':, 'PetalLength':, 'PetalWidth':, 'Species':})
    # here we remove 'Species' columns data from train dataset and turn it into
    # separate array: train_y
    train_x, train_y = train, train.pop(y_name)

    # this is also a panda DataFrame structure
    test = pd.read_csv(test_path, names=CSV_COLUMN_NAMES, header=0,
                       dtype={'SepalLength':, 'SepalWidth':, 'PetalLength':, 'PetalWidth':, 'Species':})
    # again just popping Y column but this time from the test values dataset
    test_x, test_y = test, test.pop(y_name)

    # So again - X is inputs of the model and Y is outputs, later when we are done
    # training our model with TrainXY and validating it with TestXY we should be able to
    # pass only X (from outside of these sets) and the model will predict Y on its own!
    return (train_x, train_y), (test_x, test_y)
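
To see what read_csv and pop do without downloading anything, here is a small self-contained sketch. The three data rows, the header line and the function name split_features_and_labels are made up for illustration, and the column types are left for pandas to infer rather than passed through dtype.

```python
import io

import pandas as pd

CSV_COLUMN_NAMES = ['SepalLength', 'SepalWidth',
                    'PetalLength', 'PetalWidth', 'Species']

# Hypothetical sample: the first line is a header row that `names` replaces.
SAMPLE_CSV = """a,b,c,d,e
6.4,2.8,5.6,2.2,2
5.0,2.3,3.3,1.0,1
4.9,3.1,1.5,0.1,0
"""


def split_features_and_labels(csv_text, y_name='Species'):
    """Load CSV text into a DataFrame and separate the label column from it."""
    frame = pd.read_csv(io.StringIO(csv_text), names=CSV_COLUMN_NAMES, header=0)
    # pop removes the column from the frame and returns it as a Series
    labels = frame.pop(y_name)
    return frame, labels


features, labels = split_features_and_labels(SAMPLE_CSV)
print(list(features.columns))
print(labels.tolist())
```

After pop, the features frame holds only the 4 input columns, which is exactly the X / Y separation described in the comments above.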

Now we are ready to configure the network. This is where we shape its layers, inputs and outputs. It is a very important step, and the one you are likely to experiment with the most by trying and comparing different configurations. Below we configure a Sequential model/network. It’s good for most deep learning problems. But you should also be aware of the Functional model. It’s slightly more complex and error prone but allows more flexibility. Both sequential and functional models have layers that we need to configure, and each layer has its parameters.

As you can see below, we are using the Dense() layer class, which defines the type of layer being added. Types usually differ in the number of inputs and outputs as well as the way they are connected. Dense (also called fully connected) means inputs and outputs are fully connected between layers: all neurons from layer n are connected with all neurons in the following layer n+1. At this point it’s not important that you know convolutional layers (in Keras you would add them with layer classes such as Conv2D). But if you are interested, you can see the comparison of Dense and Convolutional layers below. A convolutional layer is often called a filter and is well suited for image processing. More on this here
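
To make "fully connected" a bit more concrete, here is a rough numpy sketch of what a single dense layer with a ReLU activation computes. It is not how Keras implements the layer internally; the function name dense_forward, the random weights and the sample flower measurements are assumptions for illustration only.

```python
import numpy as np


def dense_forward(x, weights, bias):
    """Fully connected layer followed by ReLU: every input feeds every neuron."""
    return np.maximum(0.0, x @ weights + bias)


rng = np.random.default_rng(0)
x = np.array([[5.1, 3.5, 1.4, 0.2]])   # one flower described by 4 features
weights = rng.normal(size=(4, 10))     # 4 inputs x 10 neurons = 40 connections
bias = np.zeros(10)                    # one bias value per neuron

out = dense_forward(x, weights, bias)
print(out.shape)
print(weights.size + bias.size)        # trainable values in this layer
```

The weight matrix has one entry for every input and neuron pair, which is why a 4-input layer with 10 neurons ends up with 40 weights plus 10 biases.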

def get_model():
    return keras.Sequential([
        # input layer, only one that you have to specify input_shape for.
        # Input Shape will usually match number of parameters that will come as an input
        # (here: 4 parameters of Iris flower) - this shape is basically becoming
        # a separate/additional input layer on its own. Activation function is how layer
        # processes inputs to outputs - ReLU (rectified linear unit) is a simple
        # often used activation function - it is linear (identity) for all positive
        # values, and zero for all negative values. This means ReLU is cheap to
        # execute and a lot of the time is a good starting point for neural networks
        keras.layers.Dense(10, input_shape=(4,), activation=tf.nn.relu, name = "Dense_1"),
        # another dense layer with 10 neurons just like the one before and this one
        # also uses ReLU as its activation function
        keras.layers.Dense(10, activation=tf.nn.relu, name = "Dense_2"),
        # this layer is used to condense raw output of previous layer into 3 neurons
        # (we often do this before softmaxing - below) some more information on this
        # can be found here
        keras.layers.Dense(3, activation=None, name = "logits"),
        # previous layer had a specific number of neurons because there are 3
        # Classes of Iris that this network is supposed to recognize - we do however
        # need to softmax it now, softmax will basically normalize all 3 values so
        # that summed they will add to 1 (100%) - think about it as a probability
        # indicator which could be:
        # Neuron 1 (Class of iris - 'Setosa') = 0.05 so 5%
        # Neuron 2 (Class of iris - 'Versicolor') = 0.05 so 5%
        # Neuron 3 (Class of iris - 'Virginica') = 0.90 so 90%
        # this would indicate with strong probability that whatever input
        # parameters ('SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth')
        # we have passed as an input, represent 'Virginica' Iris
        keras.layers.Dense(3, activation=tf.nn.softmax, name = "softmax")
        # Btw: name parameter is just a label, does not influence any processing
    ])
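
The normalisation that softmax performs is easy to reproduce by hand. Below is a minimal numpy sketch; the logits are made up and chosen so that the result lands close to the 5% / 5% / 90% example from the comments, and mapping neuron positions to species names in this order is an assumption.

```python
import numpy as np


def softmax(logits):
    """Turn raw scores into positive values that add up to 1."""
    shifted = logits - np.max(logits)   # subtracting the max avoids overflow
    exps = np.exp(shifted)
    return exps / exps.sum()


SPECIES = ['Setosa', 'Versicolor', 'Virginica']   # assumed neuron order
logits = np.array([0.5, 0.5, 3.4])                # made-up raw scores

probs = softmax(logits)
print(probs.round(2))
print(SPECIES[int(np.argmax(probs))])             # Virginica
```

The predicted class is simply the neuron with the highest probability, which is what argmax picks out.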

Final word on ReLU

ReLU is basically a function which eliminates negative values. Imagine that the previous layer provides some negative values to the following one: some neurons may then not activate because their ReLU activation function sets the output to 0. ReLU and sigmoid are two very common activation functions that are well suited for a variety of cases. I will tell you more about them in future articles.
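
Both functions fit in a line of numpy each, so here is a quick sketch you can play with. The input values are arbitrary and chosen only to include negative numbers, zero and a positive number.

```python
import numpy as np


def relu(x):
    """Keep positive values as they are and replace negative values with 0."""
    return np.maximum(0.0, x)


def sigmoid(x):
    """Squash any real value into the range between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-x))


values = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(values))
print(sigmoid(values))
```

Notice that ReLU returns exactly 0 for every negative input, while sigmoid never quite reaches 0 or 1.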

Keeping track of our progress

We won’t learn anything more in this already lengthy article. But we’ve made a lot of progress. We’ve initialized our environment and made all the preparations for the training and test data. We’ve also configured the network model itself. In the next article we’ll compile the model and deal with the training, the evaluation and the predictions themselves. For now I strongly suggest going through the code and dissecting it in your mind, trying to understand what is happening here. Try writing down questions about anything that is unclear. You can either look for the answers in part 2 of this series or google them like I did.

Also feel free to leave any thoughts and/or questions in the comments. I will try to help as much as I can.
