Classify Images as Cat or Dog
By: Roy
Date: 06-Dec-2021
In this guide, I’ll walk you through the steps to create a machine learning model that can classify an image as a cat, a dog, or neither. We’ll also build an application that uses this model to make predictions.
Prerequisites
Before we begin, ensure you have the following:
- Caffe 1.0 installed: This includes the necessary Python interfaces. If you need help with installation, refer to Installing Caffe prerequisite_03122020.md.
- Understanding of ANN and CNN: A basic understanding of Artificial Neural Networks (ANN) and Convolutional Neural Networks (CNN) is required. You can refer to ANN and CNN.pdf for more information.
- Knowledge of Caffe prototxt files: Familiarize yourself with Caffe prototxt files by reading Understanding Caffe.pdf.
Step 1: Download the Application and Scripts
Download the necessary scripts and application from this link. You will also need the dataset for training, which you can download here.
Once downloaded, extract the dataset into a folder.
Step 2: Create LMDB
Next, create an LMDB (Lightning Memory-Mapped Database) from the downloaded images. We will create two LMDB datasets:
- Training Set: 20,000 images
- Validation Set: 4,100 images
How to Create LMDB
- Edit the script /caffe_example_catsAnddogs/scripts/create_lmdb.py and update the paths:
  - Set the path where you want to save train_lmdb and validation_lmdb.
  - Provide the path of the downloaded dataset in train_data.
- Run the script:
python create_lmdb.py
This will generate train_lmdb and validation_lmdb in the specified locations, with images labeled as 0 for cats and 1 for dogs.
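The labeling and the train/validation split can be pictured with the minimal sketch below. It is not the code of create_lmdb.py: it assumes that file names start with cat or dog (for example cat.0.jpg), and the helper names, the seed, and the shuffle-then-slice strategy are hypothetical. Only the label values (0 for cats, 1 for dogs) and the set sizes come from this guide.

```python
import os
import random

# Label values taken from this guide: 0 = cat, 1 = dog.
LABELS = {"cat": 0, "dog": 1}


def label_for(path):
    """Return 0 for cat images and 1 for dog images.

    Assumption: the file name starts with "cat" or "dog" (e.g. cat.0.jpg).
    """
    name = os.path.basename(path).lower()
    for prefix, label in LABELS.items():
        if name.startswith(prefix):
            return label
    raise ValueError("cannot infer a label from file name: " + name)


def split_dataset(paths, n_train, n_val, seed=0):
    """Shuffle the paths reproducibly, then slice off the two sets."""
    if n_train + n_val > len(paths):
        raise ValueError("not enough images for the requested split")
    shuffled = sorted(paths)  # sort first so the seed fully determines the order
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:n_train + n_val]


if __name__ == "__main__":
    # Tiny stand-in for the real dataset (20,000 / 4,100 images in this guide).
    files = ["cat.%d.jpg" % i for i in range(5)] + ["dog.%d.jpg" % i for i in range(5)]
    train, val = split_dataset(files, n_train=8, n_val=2)
    print([(f, label_for(f)) for f in train])
    print([(f, label_for(f)) for f in val])
```

Shuffling before slicing keeps both classes present in each set; slicing a sorted list would put all cats first.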
Note:
- Learn more about LMDB here.
- To install LMDB, use:
Python Path Error Workaround
If you encounter an error indicating that the Caffe Python interface is not accessible, try this workaround:
- Copy the script into the Caffe Python directory:
cp <yourpath>/caffe_example_catsAnddogs/scripts/create_lmdb.py <caffe src and install path>/python
- Run the script from the Caffe Python directory:
cd <caffe src and install path>/python
python create_lmdb.py
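An alternative to copying the script is to put the Caffe python directory on the module search path before importing caffe. The sketch below shows the idea. The helper name and the example location are hypothetical; the only thing taken from this guide is that the Python interface lives in the python folder under the Caffe source and install path.

```python
import os
import sys


def add_caffe_to_path(caffe_root):
    """Prepend <caffe_root>/python to sys.path so that `import caffe` can find pycaffe."""
    caffe_python = os.path.join(os.path.abspath(caffe_root), "python")
    if caffe_python not in sys.path:
        sys.path.insert(0, caffe_python)
    return caffe_python


if __name__ == "__main__":
    # Hypothetical location; replace with your own Caffe source/install path.
    add_caffe_to_path(os.path.expanduser("~/caffe"))
    try:
        import caffe  # noqa: F401
        print("Caffe Python interface found")
    except ImportError as err:
        print("Caffe Python interface still not importable:", err)
```

Setting the PYTHONPATH environment variable to the same directory before running the script should have the same effect without touching the code.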
Step 3: Create a Mean Image
A mean image is an image in which each pixel is the average of the corresponding pixels across all training images. It is subtracted from each input image to center the data. To create this mean image, use the following Caffe command:
<caffe install dir>/build/tools/compute_image_mean -backend=lmdb <path of train_lmdb>/train_lmdb/ <path to save mean image>/mean.binaryproto
This command will save the mean image as a binaryproto file.
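To make the idea concrete, here is a small NumPy sketch of what a mean image is. It is only conceptual: compute_image_mean reads from the LMDB and writes a binaryproto file, whereas this sketch works on in-memory arrays and touches neither format. The function name and the toy images are assumptions for illustration.

```python
import numpy as np


def mean_image(images):
    """Average a batch of equally sized images pixel by pixel.

    images: array-like of shape (N, height, width, channels).
    Returns a float32 array of shape (height, width, channels).
    """
    stack = np.asarray(images, dtype=np.float32)
    if stack.ndim != 4:
        raise ValueError("expected shape (N, height, width, channels)")
    return stack.mean(axis=0)


if __name__ == "__main__":
    # Two tiny 2x2 RGB "images" stand in for the training set.
    dark = np.zeros((2, 2, 3), dtype=np.uint8)
    bright = np.full((2, 2, 3), 200, dtype=np.uint8)
    mean = mean_image([dark, bright])
    print(mean[0, 0])  # every pixel averages to 100
    # Mean subtraction centers each input around zero.
    print(bright.astype(np.float32)[0, 0] - mean[0, 0])
```

Converting to float32 before averaging avoids the overflow that summing uint8 pixel values would cause.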
Step 4: Edit Prototxt Files
Editing caffenet_train_val_1.prototxt
This file, located at caffe_example_catsAnddogs/models, needs the following updates:
- Update the paths for train_lmdb, validation_lmdb, and the mean image in the input (data) layers.
Additionally, the loss layer used is SoftmaxWithLoss. However, you can experiment with a cross-entropy loss layer such as SigmoidCrossEntropyLoss.
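For intuition about what the softmax loss measures, the sketch below computes it for a single example in plain Python: the raw class scores are turned into probabilities, and the loss is the negative log of the probability given to the true class. The function names and the scores are made up for illustration; Caffe's layer does the same kind of computation over a whole batch.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_loss(scores, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(softmax(scores)[label])


if __name__ == "__main__":
    # Hypothetical scores for [cat, dog]; label 1 = dog, as in this guide.
    print(softmax_loss([0.0, 0.0], label=1))   # undecided network
    print(softmax_loss([-2.0, 3.0], label=1))  # confident and correct: small loss
    print(softmax_loss([3.0, -2.0], label=1))  # confident and wrong: large loss
```

An undecided two-class network gives a loss of ln 2 (about 0.69), which is a useful reference point when reading the training log.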
Editing solver_1.prototxt
Edit the file located at caffe_example_catsAnddogs/models to update the paths. Important parameters to note:
- test_iter: Number of validation batches (forward passes) run each time the model is validated. How often validation happens is set separately by test_interval.
- base_lr: Initial learning rate. Too large a value makes gradient descent overshoot the minimum; too small a value makes convergence very slow.
- max_iter: Total number of iterations before training stops.
- snapshot: Number of iterations between intermediate checkpoints of the model.
Step 5: Training the Model
To train the model, use the following Caffe command:
<path to caffe install dir>/build/tools/caffe train -solver <your path>/caffe_example_catsAnddogs/models/solver_1.prototxt 2>&1 | tee ./model_1_train.log
Logs will be saved in model_1_train.log. During training, you should observe the loss decreasing and accuracy increasing. If not, review the parameters.
Note:
- After 3,000 iterations, the loss should be around 0.2XX, and accuracy around 0.7.
- Once training completes, the trained model will be saved as a .caffemodel file.
Step 6: Testing the Final Application
The final application, written in C++, uses the Caffe Classifier class to predict whether an image is of a cat or dog. The source code is located at caffe_example_catsAnddogs/demo.cpp.
Editing caffenet_deploy_1.prototxt
This file, used for inference, is similar to caffenet_train_val_1.prototxt but with changes in the input layer to accept real image input. The final layer is a Softmax layer.
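The Softmax layer outputs one probability per class. A minimal sketch of turning that output into an answer is shown below. The class order follows the labels used when creating the LMDB (0 = cat, 1 = dog). The confidence threshold used to report "neither" is purely an assumption for illustration, and demo.cpp may decide this differently.

```python
CLASS_NAMES = ["cat", "dog"]  # index = label used when the LMDB was created


def interpret(probabilities, min_confidence=0.8):
    """Map Softmax output to a class name.

    min_confidence is a hypothetical cut-off: below it the answer is "neither".
    """
    if len(probabilities) != len(CLASS_NAMES):
        raise ValueError("expected one probability per class")
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    if probabilities[best] < min_confidence:
        return "neither"
    return CLASS_NAMES[best]


if __name__ == "__main__":
    print(interpret([0.95, 0.05]))
    print(interpret([0.10, 0.90]))
    print(interpret([0.55, 0.45]))
```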
Final Steps
- Place the trained .caffemodel file in the caffe_example_catsAnddogs/models/ directory:
cp <Your path where .caffemodel file was created>/caffe_model_1_iter_XXXX.caffemodel <yourpath>/caffe_example_catsAnddogs/models
- Modify the name of the model file in demo.cpp if you trained for a different number of iterations.
- Compile the application as per the instructions in the README.md file.
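If several snapshots were written during training, a small helper can pick the one with the highest iteration number before you copy it. The sketch below assumes the naming pattern shown in the cp command (caffe_model_1_iter_XXXX.caffemodel, with XXXX being the iteration count); the helper name is hypothetical.

```python
import os
import re

# Matches names such as caffe_model_1_iter_3000.caffemodel.
SNAPSHOT_RE = re.compile(r"_iter_(\d+)\.caffemodel$")


def latest_snapshot(names):
    """Return the .caffemodel file name with the highest iteration number, or None."""
    best_iter, best_name = -1, None
    for name in names:
        match = SNAPSHOT_RE.search(name)
        if match and int(match.group(1)) > best_iter:
            best_iter, best_name = int(match.group(1)), name
    return best_name


if __name__ == "__main__":
    # Lists the current directory; run it from the folder where training wrote its snapshots.
    print(latest_snapshot(os.listdir(".")))
```

Comparing the iteration as an integer matters: a plain string sort would place iter_900 after iter_3000.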
Running the Application
To classify an image:
./DEMO -i <image.jpg>
To classify from video feed:
./DEMO -l 0
The -l option refers to the camera device number in the system.
Samples of JPEG images are provided in the folder. For more details on compiling and running the application, refer to the README.md file.
This concludes the step-by-step guide to building and testing a cat vs. dog image classifier using Caffe. Happy coding!