Setup of an object detector

This tutorial sets up an image object detector that distinguishes among 21 object classes. For every detected object, the detector returns a bounding box around it along with a label, e.g. person, car, … The tutorial uses a deep neural net pre-trained on the VOC task.

A few examples:

[Example images: Bastille Pont Paris Us Plane Horses]

The detection service lets an application send images and receive in return, in JSON format, the set of bounding boxes for each image.

The following presupposes that DeepDetect runs as a Docker container; see the quickstart for how to set that up.

Setting up the pre-trained model


mkdir models

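This creates the directory that will hold the model. It is the host directory mounted into the container as /opt/models/ in the next step.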

Setting up the detector service

Let’s start the DeepDetect server:


docker run -d -p 8080:8080 -v /path/to/models:/opt/models/ jolibrain/deepdetect_cpu

and create a service:


curl -X PUT "http://localhost:8080/services/imageserv" -d '{
    "description": "object detection service",
    "mllib": "caffe",
    "model": {
        "init": "https://deepdetect.com/models/init/desktop/images/detection/detection_voc0712.tar.gz",
        "repository": "/opt/models/detection_voc0712",
        "create_repository": true
    },
    "parameters": {
        "input": {
            "connector": "image"
        }
    },
    "type": "supervised"
}'

This should yield:


{
  "status": {
    "code": 201,
    "msg": "Created"
  }
}

And this is all it takes to set up the pre-trained model.
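The same service creation can also be scripted. The following is a minimal Python sketch using only the standard library, not an official client: the helper names `build_service_request` and `create_service` are hypothetical, the host and model paths are copied from the curl call, and the service name is expected to match the one used later in the predict call.

```python
import json
import urllib.request

# Assumption: the server address used by the curl examples in this tutorial.
HOST = "http://localhost:8080"


def build_service_request(name, host=HOST):
    """Build (but do not send) the PUT request that creates the service."""
    payload = {
        "description": "object detection service",
        "mllib": "caffe",
        "type": "supervised",
        "model": {
            "init": "https://deepdetect.com/models/init/desktop/images/detection/detection_voc0712.tar.gz",
            "repository": "/opt/models/detection_voc0712",
            "create_repository": True,
        },
        "parameters": {"input": {"connector": "image"}},
    }
    return urllib.request.Request(
        f"{host}/services/{name}",
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
    )


def create_service(name, host=HOST):
    """Send the request and return the decoded JSON reply.

    Requires a running DeepDetect server; urlopen raises on HTTP errors.
    """
    with urllib.request.urlopen(build_service_request(name, host)) as resp:
        return json.load(resp)
```

With the server running, `create_service("imageserv")` should return the same "Created" status shown above. Keeping request construction separate from sending makes the payload easy to inspect without a server.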

Testing object detection

We can now pass any image file path or URL to our object detector. Here is an example:


curl -X POST "http://localhost:8080/predict" -d '{
    "service": "imageserv",
    "parameters": {
        "output": {
            "bbox": true,
            "confidence_threshold": 0.1
        }
    },
    "data": ["https://photos.wi.gcs.trstatic.net/e9hyHkaRFZdDV_jLZuTS6jcYq1eLUfiFzfl9zNavmNuoyZ-3UCX_EGg6D5TNU--V-f-z2CT8Kg0u3mF9gccUiA"]
}'

yields:


{
  "status": {
    "msg": "OK",
    "code": 200
  },
  "body": {
    "predictions": [
      {
        "classes": [
          {
            "cat": "bird",
            "prob": 0.8333460688591003,
            "bbox": {
              "xmin": 67.03402709960938,
              "ymin": 414.25286865234375,
              "ymax": 64.85651397705078,
              "xmax": 354.663330078125
            }
          },
          {
            "cat": "person",
            "prob": 0.5956286191940308,
            "bbox": {
              "xmin": 75.99663543701172,
              "ymin": 475.9880676269531,
              "ymax": 66.72187805175781,
              "xmax": 363.94293212890625
            }
          },
          {
            "cat": "person",
            "prob": 0.2928898334503174,
            "bbox": {
              "xmin": 495.8335876464844,
              "ymin": 735.4041748046875,
              "ymax": 506.434326171875,
              "xmax": 652.080078125
            }
          },
          {
            "cat": "person",
            "prob": 0.24435117840766907,
            "bbox": {
              "xmin": 437.17041015625,
              "ymin": 540.1434936523438,
              "ymax": 111.70045471191406,
              "xmax": 633.19970703125
            }
          },
          {
            "cat": "bird",
            "prob": 0.16601955890655518,
            "bbox": {
              "xmin": 40.96523666381836,
              "ymin": 280.6235046386719,
              "ymax": 71.90843200683594,
              "xmax": 259.4865417480469
            }
          },
          {
            "cat": "person",
            "prob": 0.12583601474761963,
            "bbox": {
              "xmin": 358.8877868652344,
              "ymin": 763.8483276367188,
              "ymax": 532.8911743164062,
              "xmax": 491.5361022949219
            }
          },
          {
            "cat": "person",
            "last": true,
            "prob": 0.11492644995450974,
            "bbox": {
              "xmin": 213.4755096435547,
              "ymin": 793.69287109375,
              "ymax": 545.5011596679688,
              "xmax": 355.6097717285156
            }
          }
        ],
        "uri": "https://photos.wi.gcs.trstatic.net/e9hyHkaRFZdDV_jLZuTS6jcYq1eLUfiFzfl9zNavmNuoyZ-3UCX_EGg6D5TNU--V-f-z2CT8Kg0u3mF9gccUiA"
      }
    ]
  },
  "head": {
    "method": "/predict",
    "service": "imageserv",
    "time": 1903
  }
}

The resulting JSON contains:

  • bounding boxes as bbox JSON objects
  • the estimated category cat of the object
  • the confidence of the detection as a probability prob, the higher the better

Note that confidence_threshold removes any prediction whose prob is strictly below the threshold.
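An application typically needs to build the predict payload and then walk the reply. The sketch below shows one way to do both in Python with the standard library; the function names `build_predict_payload` and `parse_detections` are hypothetical, and the client-side `min_prob` filter is an extra illustration on top of the server-side confidence_threshold. In the sample output above ymin is larger than ymax, so the sketch sorts each coordinate pair rather than assuming a particular orientation.

```python
def build_predict_payload(service, uris, confidence_threshold=0.1):
    """Build the JSON body for POST /predict, mirroring the curl call above."""
    return {
        "service": service,
        "parameters": {
            "output": {"bbox": True, "confidence_threshold": confidence_threshold}
        },
        "data": list(uris),
    }


def parse_detections(response, min_prob=0.0):
    """Flatten a /predict reply into a list of detections with prob >= min_prob."""
    detections = []
    for prediction in response["body"]["predictions"]:
        for cls in prediction["classes"]:
            if cls["prob"] < min_prob:
                continue  # drop detections strictly below the threshold
            box = cls["bbox"]
            # Sort each pair so the box is always (left, top, right, bottom).
            left, right = sorted((box["xmin"], box["xmax"]))
            top, bottom = sorted((box["ymin"], box["ymax"]))
            detections.append({
                "uri": prediction["uri"],
                "label": cls["cat"],
                "prob": cls["prob"],
                "box": (left, top, right, bottom),
            })
    return detections


# Trimmed-down reply reusing two entries from the output above;
# the uri is a placeholder, not the one from the example.
sample = {
    "body": {"predictions": [{
        "uri": "example.jpg",
        "classes": [
            {"cat": "bird", "prob": 0.8333460688591003,
             "bbox": {"xmin": 67.03402709960938, "ymin": 414.25286865234375,
                      "ymax": 64.85651397705078, "xmax": 354.663330078125}},
            {"cat": "person", "prob": 0.2928898334503174,
             "bbox": {"xmin": 495.8335876464844, "ymin": 735.4041748046875,
                      "ymax": 506.434326171875, "xmax": 652.080078125}},
        ],
    }]}
}

for det in parse_detections(sample, min_prob=0.5):
    print(det["label"], round(det["prob"], 3), det["box"])
```

Filtering on the client as well lets an application request a low confidence_threshold once and then experiment with stricter cut-offs without calling the server again.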

You can use the object detection Python script to generate the bounding boxes on an image:

[Image: Bastille]
