Training an image classifier

Image classification is the task of attaching a single label to an image, e.g. determining whether an image shows a cat or a dog.

Data format

The platform has the following requirements for training a single-label image classifier:

  • All data must be in an image format; most encodings are supported (e.g. png, jpg, …)
  • Use one directory for each class
  • Each directory contains one image per training or testing sample for that class

Data format for image classification:

mydata/train/
mydata/train/dogs/
mydata/train/dogs/dog.1223.jpg
mydata/train/dogs/dog.2124.jpg
mydata/train/cats/cat.314.jpg
mydata/train/cats/cat.3124.jpg

mydata/test/dogs/
mydata/test/dogs/dog.3333.jpg
mydata/test/cats/
mydata/test/cats/cat.1123.jpg
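
The layout above can be sanity-checked with a few lines of Python before opening the platform UI. This is a minimal sketch and not part of the platform: the function names and the set of accepted extensions are assumptions made for illustration, and the demo builds a temporary copy of the example tree with empty placeholder files.

```python
from pathlib import Path
import tempfile

# Extensions counted as images in this sketch (an assumption, not the platform's list).
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def count_images_per_class(split_dir):
    """Return {class_name: image_count} for one split directory such as mydata/train."""
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if not class_dir.is_dir():
            continue  # only directories are treated as classes
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts


def build_demo_dataset(root):
    """Recreate the example layout with empty placeholder files (hypothetical helper)."""
    files = [
        "train/dogs/dog.1223.jpg",
        "train/dogs/dog.2124.jpg",
        "train/cats/cat.314.jpg",
        "train/cats/cat.3124.jpg",
        "test/dogs/dog.3333.jpg",
        "test/cats/cat.1123.jpg",
    ]
    for rel in files:
        path = Path(root) / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_dataset(tmp)
        print("train:", count_images_per_class(Path(tmp) / "train"))
        print("test:", count_images_per_class(Path(tmp) / "test"))
```

A class directory that reports zero images, or a class present in train but missing from test, is worth fixing before training starts.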

The DD platform comes with a custom Jupyter UI that lets you check your dataset before starting the training:

Image classification data check in DD platform Jupyter UI

Training an image classifier

Using the DD platform, from a JupyterLab notebook, start from the code below.

Image classification notebook snippet:

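# Assumes the dd_widgets package available in the platform notebooks
from dd_widgets import Classification

img_classif = Classification(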
  'dogs_cats',
  training_repo='/opt/platform/examples/dogs_cats/train/',
  tsplit=0.2,
  host='deepdetect_training',
  port=8080,
  model_repo='/opt/platform/models/training/JeanDupont/dogs_cats',
  template='se_resnet_50',
  img_width=224,
  img_height=224,
  mirror=True,
  rotate=False,
  base_lr=0.001,
  solver_type='AMSGRAD',
  finetune=True,
  weights='/opt/platform/models/pretrained/se_resnet_50/SE-ResNet-50.caffemodel',
  iterations=2500,
  test_interval=500,
  snapshot_interval=5000,
  batch_size=16,
  iter_size=2,
  nclasses=2,
  test_batch_size=4,
  noise_prob=0.001,
  distort_prob=0.001,
  gpuid=0
)
img_classif

This prepares the training of an image classifier with the following parameters:

  • dogs_cats is the example job name
  • training_repo specifies the location of the data
  • template specifies the network architecture, here a Squeeze-and-Excitation ResNet-50 (fast, with excellent accuracy)
  • mirror activates mirroring of inputs as data augmentation
  • noise_prob and distort_prob activate random transforms for data augmentation, see https://deepdetect.com/tutorials/data-augmentation/ for more details
  • finetune automatically prepares the network architecture for finetuning
  • weights specifies the pre-trained model weights to start training from
  • solver_type specifies the optimizer, see the solver_type parameter at https://deepdetect.com/api/#launch-a-training-job for the available options
  • base_lr specifies the learning rate
  • gpuid specifies which GPU to use, numbered from 0
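
The snippet also sets several scheduling parameters that the list above does not describe (batch_size, iter_size, iterations, test_interval, snapshot_interval). The sketch below works out what they imply, under the assumption that iter_size accumulates gradients over several batches as the setting of the same name does in Caffe solvers. The function name and the returned keys are hypothetical, chosen only for this illustration.

```python
def schedule_summary(batch_size, iter_size, iterations, test_interval, snapshot_interval):
    """Derive simple schedule figures from the training parameters.

    Assumption: iter_size means gradient accumulation, so each solver
    iteration consumes batch_size * iter_size images.
    """
    effective_batch = batch_size * iter_size
    return {
        "effective_batch_size": effective_batch,
        "images_processed": effective_batch * iterations,
        # Number of multiples of each interval reached before training ends.
        "test_points": iterations // test_interval,
        "snapshot_points": iterations // snapshot_interval,
    }


if __name__ == "__main__":
    summary = schedule_summary(
        batch_size=16,
        iter_size=2,
        iterations=2500,
        test_interval=500,
        snapshot_interval=5000,
    )
    for name, value in summary.items():
        print(f"{name}: {value}")
```

With the values from the snippet, snapshot_interval is larger than iterations, so no multiple of the snapshot interval is reached during the run. Whether a model is still written when training finishes is not covered by this sketch.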

You should get around 99% accuracy on the example cats & dogs dataset with the configuration above.

The platform has many neural network architectures and pre-trained models built in for image classification. These range from state-of-the-art architectures like ResNets and DenseNets to low-memory SqueezeNet, ShuffleNet and MobileNet.

Below is a list of recommended models for image classification; choose the one that best fits your task.

Model         Template      Pre-Trained (/opt/platform/models/pretrained)  Recommendation
GoogleNet     googlenet     googlenet/bvlc_googlenet.caffemodel            Very Fast / Good accuracy / embedded & desktops
SE-ResNet-50  se_resnet_50  se_resnet_50/SE-ResNet-50.caffemodel           Fast / Excellent accuracy / desktops
SqueezeNet    squeezenet    squeezenet/squeezenet_v1.1.caffemodel          Extremely Fast / Good accuracy / embedded
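
To switch between the recommended models without retyping paths, the table can be captured in a small lookup. This is only a sketch: the dictionary contents come from the table above, while the helper name finetune_args is hypothetical and not part of the platform.

```python
from pathlib import PurePosixPath

# Root directory of the pre-trained models, as given in the table header.
PRETRAINED_ROOT = PurePosixPath("/opt/platform/models/pretrained")

# Template name -> weights file, relative to PRETRAINED_ROOT.
PRETRAINED_WEIGHTS = {
    "googlenet": "googlenet/bvlc_googlenet.caffemodel",
    "se_resnet_50": "se_resnet_50/SE-ResNet-50.caffemodel",
    "squeezenet": "squeezenet/squeezenet_v1.1.caffemodel",
}


def finetune_args(template):
    """Return the template, finetune and weights arguments for a recommended model."""
    if template not in PRETRAINED_WEIGHTS:
        known = ", ".join(sorted(PRETRAINED_WEIGHTS))
        raise ValueError(f"unknown template {template!r}; expected one of: {known}")
    return {
        "template": template,
        "finetune": True,
        "weights": str(PRETRAINED_ROOT / PRETRAINED_WEIGHTS[template]),
    }


if __name__ == "__main__":
    for name in PRETRAINED_WEIGHTS:
        print(finetune_args(name))
```

The returned dictionary could then be unpacked into the notebook call, for example Classification('dogs_cats', **finetune_args('squeezenet'), ...), with the remaining parameters left as in the snippet above.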

For a full list of available templates and models, see https://github.com/jolibrain/deepdetect/blob/master/README.md#models

For a review of performances on desktops and embedded devices, see https://github.com/jolibrain/dd_performances
