Classify Images as Cat or Dog Using ANN

Classify Images as Cat or Dog

By: Roy
Date: 06-Dec-2021

In this guide, I’ll walk you through the steps to create a machine learning model that can classify images as either a cat, a dog, or neither. We’ll also build an application that uses this model to make predictions.

Prerequisites

Before we begin, ensure you have the following:

  • Caffe 1.0 Installed: This includes the necessary Python interfaces. If you need help with installation, refer to Installing Caffe prerequisite_03122020.md.
  • Understanding of ANN and CNN: A basic understanding of Artificial Neural Networks (ANN) and Convolutional Neural Networks (CNN) is required. You can refer to ANN and CNN.pdf for more information.
  • Knowledge of Caffe Prototxt Files: Familiarize yourself with Caffe prototxt files by reading Understanding Caffe.pdf.

Step 1: Download the Application and Scripts

Download the necessary scripts and application from this link. You will also need the dataset for training, which you can download here.

Once downloaded, extract the dataset into a folder.

Step 2: Create LMDB

Next, create an LMDB (Lightning Memory-Mapped Database) from the downloaded images. We will create two LMDB datasets:

  • Training Set: 20,000 images
  • Validation Set: 4,100 images

How to Create LMDB

  1. Edit the script /caffe_example_catsAnddogs/scripts/create_lmdb.py and update the paths:
    • Set the path where you want to save train_lmdb and validation_lmdb.
    • Provide the path of the downloaded dataset in train_data.
  2. Run the script:

python create_lmdb.py

This will generate train_lmdb and validation_lmdb in the specified locations, with images labeled as 0 for cats and 1 for dogs.
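The labeling and the split into training and validation sets can be sketched in plain Python. This is a minimal illustration, not the code in create_lmdb.py: it assumes file names start with "cat" or "dog" (for example cat.12.jpg), and the helper names and the "every sixth image goes to validation" rule are hypothetical.

```python
import os

def label_for(path):
    """Return 0 for a cat image and 1 for a dog image, based on the file name."""
    name = os.path.basename(path).lower()
    if name.startswith("cat"):
        return 0
    if name.startswith("dog"):
        return 1
    raise ValueError("cannot infer label from file name: " + path)

def split_train_validation(paths, every=6):
    """Send every `every`-th image to validation and the rest to training."""
    train, validation = [], []
    for index, path in enumerate(paths):
        if index % every == 0:
            validation.append(path)
        else:
            train.append(path)
    return train, validation
```

Each (image, label) pair would then be written to train_lmdb or validation_lmdb, depending on which list the image falls into.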

Note:

    • Learn more about LMDB here.
    • To install LMDB, use:
sudo apt-get install liblmdb-dev

3. Python Path Error Workaround

If you encounter an error indicating that the Caffe Python interface is not accessible, try this workaround:

  1. Copy the script: cp <yourpath>/caffe_example_catsAnddogs/scripts/create_lmdb.py <caffe src and install path>/python
  2. Run the script from the Caffe Python directory:

cd <caffe src and install path>/python
python create_lmdb.py
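An alternative to copying the script is to add Caffe's python directory to the module search path before importing caffe. This is offered as an assumption, not something the original scripts do, and the function name and the caffe_root argument are hypothetical.

```python
import os
import sys

def add_caffe_to_path(caffe_root):
    """Put <caffe_root>/python at the front of sys.path (once) and return it."""
    python_dir = os.path.join(caffe_root, "python")
    if python_dir not in sys.path:
        sys.path.insert(0, python_dir)
    return python_dir

# Usage (the path is a placeholder): call this before `import caffe`.
# add_caffe_to_path("/path/to/caffe")
```

Setting the PYTHONPATH environment variable to the same directory has the same effect without touching the script.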

Step 3: Create a Mean Image

A mean image is an image in which each pixel is the average of the corresponding pixels across all images in the training set. To create this mean image, use the following Caffe command:

<caffe install dir>/build/tools/compute_image_mean -backend=lmdb <path of train_lmdb>/train_lmdb/ <path to save mean image>/mean.binaryproto

This command will save the mean image as a binaryproto file.
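To make the idea concrete, here is a minimal plain-Python sketch of the averaging itself. It does not read LMDB or write binaryproto. It assumes every image is a list of rows with identical dimensions, and the function name is hypothetical.

```python
def mean_image(images):
    """Average same-sized images pixel by pixel.

    Each image is a list of rows, and each row is a list of pixel values.
    """
    count = len(images)
    height = len(images[0])
    width = len(images[0][0])
    return [
        [sum(image[row][col] for image in images) / count for col in range(width)]
        for row in range(height)
    ]

# Two tiny 2x2 single-channel "images"
first = [[0, 2], [4, 6]]
second = [[2, 4], [6, 8]]
print(mean_image([first, second]))  # [[1.0, 3.0], [5.0, 7.0]]
```

During training and inference the mean image is subtracted from each input image, so that pixel values are centered around zero.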

Step 4: Edit Prototxt Files

Editing caffenet_train_val_1.prototxt

This file, located at caffe_example_catsAnddogs/models, needs the following updates:

  • Update the paths for train_lmdb, validation_lmdb, and the mean image in the input layer.

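Additionally, the loss layer used is SoftmaxWithLoss, which applies a softmax to the network output and then computes the multinomial logistic (cross-entropy) loss. You can experiment with other Caffe loss layers.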
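As a rough illustration of what a softmax followed by a cross-entropy loss computes for a single image, here is a plain-Python sketch. It is not Caffe's implementation, and the function names are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]

def softmax_loss(scores, label):
    """Cross-entropy loss: negative log of the probability of the true class."""
    return -math.log(softmax(scores)[label])

# Scores for [cat, dog]; the true label is 0 (cat).
print(softmax_loss([2.0, 0.5], 0))
```

When both scores are equal, the loss is ln 2 (about 0.693), which is what a two-class network that has not learned anything yet produces.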

Editing solver_1.prototxt

Edit the file located at caffe_example_catsAnddogs/models to update the paths. Important parameters to note:

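  • test_iter: Number of validation batches evaluated each time the model is tested. Multiplied by the validation batch size, it should cover the validation set.
  • test_interval: Number of training iterations between two validation passes.
  • base_lr: Initial learning rate. A value that is too high makes the loss oscillate or diverge; one that is too low makes training converge slowly.
  • max_iter: Total number of training iterations before training stops.
  • snapshot: Number of iterations between intermediate checkpoints of the model.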

Step 5: Training the Model

To train the model, use the following Caffe command:

<path to caffe install dir>/build/tools/caffe train -solver <your path>/caffe_example_catsAnddogs/models/solver_1.prototxt 2>&1 | tee ./model_1_train.log

Logs will be saved in model_1_train.log. During training, you should observe the loss decreasing and accuracy increasing. If not, review the parameters.

Note:

  • After 3,000 iterations, the loss should be around 0.2XX, and accuracy around 0.7.
  • Once training completes, the trained model will be saved as a .caffemodel file.
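To check the trend without reading the whole log, the loss and accuracy values can be extracted with a short script. This is a sketch under an assumption about the log format: that training lines contain "Iteration N ... loss = X" and test lines contain "Test net output #K: accuracy = X". Adjust the patterns if your log differs. The function name is hypothetical.

```python
import re

# Assumed line formats; adapt these patterns to your actual log.
TRAIN_LOSS = re.compile(r"Iteration (\d+).*?, loss = ([\d.eE+-]+)")
TEST_ACCURACY = re.compile(r"Test net output #\d+: accuracy = ([\d.eE+-]+)")

def parse_log(lines):
    """Return ([(iteration, loss), ...], [accuracy, ...]) found in the log lines."""
    losses, accuracies = [], []
    for line in lines:
        match = TRAIN_LOSS.search(line)
        if match:
            losses.append((int(match.group(1)), float(match.group(2))))
            continue
        match = TEST_ACCURACY.search(line)
        if match:
            accuracies.append(float(match.group(1)))
    return losses, accuracies

# Usage:
# with open("model_1_train.log") as log_file:
#     losses, accuracies = parse_log(log_file)
```

If the loss values do not shrink over the iterations, or the accuracy stays near 0.5, that is a hint to revisit base_lr and the other solver parameters.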

Step 6: Testing the Final Application

The final application, written in C++, uses the Caffe Classifier class to predict whether an image is of a cat or dog. The source code is located at caffe_example_catsAnddogs/demo.cpp.

Editing caffenet_deploy_1.prototxt

This file, used for inference, is similar to caffenet_train_val_1.prototxt but with changes in the input layer to accept real image input. The final layer is a Softmax layer.
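The Softmax layer outputs one probability per class. This guide does not show how demo.cpp turns those probabilities into "cat", "dog", or "neither", so the following is only one possible approach, assuming a confidence threshold. The threshold value of 0.8 and the function name are hypothetical.

```python
LABELS = ["cat", "dog"]  # index 0 = cat, index 1 = dog, matching the LMDB labels

def describe(probabilities, min_confidence=0.8):
    """Pick the most likely class, or "neither" if the model is not confident."""
    best = max(range(len(probabilities)), key=lambda index: probabilities[index])
    if probabilities[best] < min_confidence:
        return "neither"
    return LABELS[best]

print(describe([0.95, 0.05]))  # cat
print(describe([0.55, 0.45]))  # neither
```

A threshold like this is a heuristic: a network trained only on cats and dogs can still be confidently wrong on unrelated images.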

Final Steps

  1. Place the trained .caffemodel file in the caffe_example_catsAnddogs/models/ directory.

cp <Your path where .caffemodel file was created>/caffe_model_1_iter_XXXX.caffemodel <yourpath>/caffe_example_catsAnddogs/models/

  2. Modify the name of the model file in demo.cpp if you trained for a different number of iterations.
  3. Compile the application following the instructions in the README.md file.

Running the Application

To classify an image:

./DEMO -i <image.jpg>

To classify from a video feed:

./DEMO -l 0

  • The -l option specifies the camera device number on the system (0 in the example above).

Samples of JPEG images are provided in the folder. For more details on compiling and running the application, refer to the README.md file.


This concludes the step-by-step guide to building and testing a cat vs. dog image classifier using Caffe. Happy coding!