Training with Data Augmentation

Deep models for image classification and object recognition are often not robust enough for production. In practice they are easily fooled, either deliberately (e.g. via adversarial samples) or accidentally (e.g. by noisy user-generated content).

DeepDetect supports strong data augmentation when training image models with its Caffe backend.

User-Generated Content (UGC)

In the real world, user-generated content, such as images from mobile phones, can be of variable quality. A deep neural network can be taught to handle such noise at training time. A common way is to add noise to the images while training. DeepDetect supports a wide range of on-the-fly transforms to accommodate UGC. Data augmentation is a common feature of deep learning tools; some of the available transforms are detailed below.
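To make the idea of adding noise at training time concrete, here is a minimal pure-Python sketch of a salt-and-pepper transform applied with a given probability. It is an illustration only and not DeepDetect's implementation: the function names, the `amount` parameter, and the grayscale list-of-rows image format are all assumptions made for this example.

```python
import random

def salt_and_pepper(image, amount, rng):
    """Return a copy of a grayscale image (rows of 0-255 ints) in which each
    pixel is replaced by 0 (pepper) or 255 (salt) with probability `amount`."""
    noisy = [row[:] for row in image]  # copy so the input is left untouched
    for row in noisy:
        for x in range(len(row)):
            if rng.random() < amount:
                row[x] = rng.choice((0, 255))
    return noisy

def maybe_augment(image, prob, rng):
    """Apply the transform with probability `prob`, otherwise return the
    image unchanged (hypothetical helper, for illustration only)."""
    if rng.random() < prob:
        return salt_and_pepper(image, amount=0.05, rng=rng)
    return image

rng = random.Random(0)                   # seeded for repeatable runs
image = [[128] * 8 for _ in range(8)]    # flat gray 8x8 test image
noisy = salt_and_pepper(image, amount=0.2, rng=rng)
changed = sum(p != 128 for row in noisy for p in row)
print(f"{changed} of 64 pixels altered")
```

Because the transform is drawn again each time an image is read, the network sees a slightly different version of the same image across epochs, which is what "on-the-fly" refers to here.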

Augmented datasets

Deep learning generalizes well when trained on large amounts of data. Data augmentation is especially useful in the situations below:

  • Small dataset: the data can be artificially enlarged through data augmentation and data generation. DeepDetect supports on-the-fly data augmentation.

  • User-generated content: data augmentation injects a variety of noise and distortions at training time in order to make the model more robust.

DeepDetect supports the following transforms: crop, rotation, decolorization, histogram equalization, image inversion, Gaussian blur, posterization, erosion, salt-and-pepper noise, contrast limited adaptive histogram equalization, conversion to HSV and LAB, brightness distortion, contrast distortion, saturation distortion, hue distortion, random permutations, horizontal and vertical perspective transforms, and zoom.

Training robust models with DeepDetect

Training a robust image model is made easy via the API. Augmentation is controlled through the noise and distort objects of the mllib object. Both noise and distort can be used independently or together.
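As a sketch of how the two objects can sit side by side inside mllib, the snippet below builds the fragment in Python and serializes it with the standard json module. The key names (noise, distort, all_effects, prob, decolorize, saltpepper, brightness, contrast) are the ones used in this document's examples; the helper `augmentation_block` is hypothetical, and how the fragment is embedded in a full training call is not shown here.

```python
import json

def augmentation_block(prob, effects=None):
    """Build a noise or distort object (hypothetical helper).

    With effects=None all effects are selected via all_effects; otherwise
    only the named effects are switched on. A single prob applies to all."""
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must be in [0, 1]")
    if effects is None:
        block = {"all_effects": True}
    else:
        block = {name: True for name in effects}
    block["prob"] = prob
    return block

# noise and distort used together, each with its own selection and prob
mllib = {
    "noise": augmentation_block(0.1, ["decolorize", "saltpepper"]),
    "distort": augmentation_block(0.1, ["brightness", "contrast"]),
}
fragment = json.dumps({"mllib": mllib})
print(fragment)
```

Generating the fragment with json.dumps rather than writing it by hand avoids quoting and bracket mistakes, which are easy to make in long single-line JSON.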

Overall control is exercised by selecting the effects to apply and setting the probability of their occurrence:

  • Effect selection: use all_effects:true to select all effects, or enable individual effects, e.g. decolorize:true, brightness:true.

  • Probability of occurrence: a single prob parameter controls the occurrence of transforms within the noise or distort object, i.e. the occurrence of individual effects cannot be controlled separately. This also means that the value of prob needs to be set according to the number of activated effects. Typically, for noise with all_effects:true, a good value is prob:0.01: since there are 10 transforms in the noise category, there are 10 sampling steps per image, yielding on average one transformation every 10 images.
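The arithmetic behind the prob:0.01 recommendation can be checked with a few lines of Python. This sketch assumes that each activated effect is sampled once per image, independently, with the same probability prob, which is how the "10 sampling steps per image" statement reads; the function names are hypothetical.

```python
def expected_transforms_per_image(prob, n_effects):
    """Expected number of effects applied to one image when each of the
    n_effects activated effects is sampled once with probability prob."""
    return prob * n_effects

def prob_any_transform(prob, n_effects):
    """Probability that at least one effect fires on a given image,
    assuming the n_effects draws are independent."""
    return 1.0 - (1.0 - prob) ** n_effects

def prob_for_target_rate(target_rate, n_effects):
    """prob value giving `target_rate` expected transforms per image."""
    return target_rate / n_effects

rate = expected_transforms_per_image(0.01, 10)
print(f"expected transforms per image: {rate:.3f}")
print(f"about one transform every {1 / rate:.0f} images")
print(f"chance an image is altered at all: {prob_any_transform(0.01, 10):.4f}")
print(f"prob for the same rate with 2 effects: {prob_for_target_rate(rate, 2):.3f}")
```

The last line illustrates why prob has to follow the number of activated effects: with only two effects enabled, a larger prob is needed to keep the same overall rate of augmented images.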

Note that training will of course take longer with data augmentation.

Noise augmentation

Examples of settings for noise augmentation:

  • all effects

"mllib":{"noise":{"all_effects":true, "prob":0.01}}

  • selected effects

"mllib":{"noise":{"decolorize":true, "saltpepper":true, "prob":0.1}}


Distort augmentation

Examples of settings for distort augmentation:

  • all effects

"mllib":{"distort":{"all_effects":true, "prob":0.01}}

  • selected effects

"mllib":{"distort":{"brightness":true, "contrast":true, "prob":0.1}}

Perspective transforms

  • all effects

  • selected effects

"mllib":{"geometry":{"all_effects":false, "persp_horizontal":true, "persp_vertical":false, "zoom_in":true, "zoom_out":true, "persp_factor":0.2, "zoom_factor":0.1, "pad_mode":"mirrored", "prob":0.1}}