Object Detection with Tensorflow 2

Tensorflow Object Detection Guide

There are many good guides for getting started with the TF Object Detection API, but unfortunately most of them are written for the TF v1 API.

We will take a look at how to use the TF v2 Object Detection API to build a model for a custom dataset in a Google Colab notebook.

Before we begin the setup, change the runtime type in Colab to GPU so that we can make use of the free GPU provided.

1. Installing Dependencies and setting up the workspace.

Create a folder for your workspace

%mkdir workspace
%cd /content/workspace

We will be cloning the TF repository from GitHub

!git clone --quiet https://github.com/tensorflow/models.git

And before we install TF Object Detection we must install Protobuf.

“The Tensorflow Object Detection API uses Protobufs to configure model and training parameters. Before the framework can be used, the Protobuf libraries must be downloaded and compiled”

!apt-get install -qq protobuf-compiler python-pil python-lxml python-tk
!pip install -qq Cython contextlib2 pillow lxml matplotlib
!pip install -qq pycocotools
%cd models/research/
!protoc object_detection/protos/*.proto --python_out=.

Now we install the TF Object Detection API

%cp object_detection/packages/tf2/setup.py .
!python -m pip install .
!python object_detection/builders/model_builder_tf2_test.py


2. Preparing the Dataset

There are two ways to go about this:

  • Use a Public Labelled Dataset
  • Create a Custom Labelled Dataset

You can find Public Labelled Datasets online, which are already labeled and saved in the right format, ready to be used to train.

For this tutorial, we will be creating our own dataset from scratch.

First things first, gather the images for the dataset. I will assume this step has already been done.

Now we need to label the images. There are many popular labeling tools; we will be using LabelImg.

To install LabelImg, run the following command (do this in your local terminal, since Colab does not support GUI applications):

pip install labelImg

Launch LabelImg in the folder where your images are stored.

labelImg imagesdir

Now you can start labeling your images. For more information on how to label them, see the LabelImg repository.

Label Img

Create a label map in a text editor (label_map.pbtxt) as follows, here with two example classes, car and bike:

item {
  id: 1
  name: 'car'
}

item {
  id: 2
  name: 'bike'
}
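
If you have more than a handful of classes, writing the label map by hand gets error-prone. The snippet below is a minimal sketch that generates the same text from a list of class names. The function name build_label_map is an illustrative assumption, not part of the Object Detection API; ids start at 1, as in the example above.

```python
def build_label_map(class_names):
    """Return label_map.pbtxt text with one item per class, ids starting at 1."""
    entries = []
    for class_id, name in enumerate(class_names, start=1):
        entries.append(
            "item {\n"
            f"  id: {class_id}\n"
            f"  name: '{name}'\n"
            "}\n"
        )
    # Separate the items with a blank line for readability.
    return "\n".join(entries)


label_map_text = build_label_map(["car", "bike"])
print(label_map_text)
```

Save the returned text as label_map.pbtxt, for example with open("label_map.pbtxt", "w").write(label_map_text).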

Now for creating the TFRecord files.

We can do the following:

  • Create TFRecord ourselves
  • Upload the annotations to Roboflow and get the dataset in TFRecord Format.

Creating the TFRecords ourselves is a bit tedious as the XML created after annotating may sometimes vary, so for the sake of ease, I suggest using Roboflow to perform the above task. They also provide an option to perform additional Data Augmentation which will increase the size of the dataset.

For your reference, here is a sample .py script to create the TFRecords manually.

Run the script once for the training images and once for the test images to create train.tfrecord and test.tfrecord respectively, changing the following variables each time (shown here for the test set):

xml_dir = 'images/test'
image_dir = 'images/test'
output_path = 'annotations/test.tfrecord'
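
As a rough sketch of the first step such a script has to perform, the snippet below parses one LabelImg annotation (Pascal VOC XML) into box coordinates expressed as fractions of the image size, which is how box coordinates are stored in the TFRecord. It uses only the standard library and stops short of writing the record itself. The function name parse_voc_annotation, the label_to_id dictionary and the sample XML are illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def parse_voc_annotation(xml_text, label_to_id):
    """Parse one LabelImg (Pascal VOC) XML document into normalized boxes."""
    root = ET.fromstring(xml_text.strip())
    size = root.find("size")
    width = float(size.find("width").text)
    height = float(size.find("height").text)

    boxes = []
    for obj in root.findall("object"):
        name = obj.find("name").text
        bndbox = obj.find("bndbox")
        boxes.append({
            "class_text": name,
            "class_id": label_to_id[name],
            # Pixel coordinates are divided by the image size to get fractions.
            "xmin": float(bndbox.find("xmin").text) / width,
            "ymin": float(bndbox.find("ymin").text) / height,
            "xmax": float(bndbox.find("xmax").text) / width,
            "ymax": float(bndbox.find("ymax").text) / height,
        })
    return root.find("filename").text, boxes


# Hypothetical annotation for a 640x480 image containing one car.
SAMPLE_XML = """
<annotation>
  <filename>street.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>car</name>
    <bndbox><xmin>64</xmin><ymin>120</ymin><xmax>320</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>
"""

filename, boxes = parse_voc_annotation(SAMPLE_XML, {"car": 1, "bike": 2})
print(filename, boxes)
```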

By using Roboflow you will be provided the TFRecord files automatically.

Setting up on Colab

Create folders to store all the necessary files we have just created.

%cd /content/workspace
%mkdir annotations exported-models pre-trained-models models/my_mobilenet # my_mobilenet folder is where our training results will be stored

Now upload the newly created TFRecord files along with the images and annotations to Google Colab by clicking upload files.

Alternatively, you can keep the files in Google Drive; once the drive is mounted, importing them into Colab is as simple as a !cp command.

Download Pre-Trained Model

There are many models ready to download from the Tensorflow Model Zoo.

Be careful in choosing which model to use as some are not made for Object Detection. For this tutorial we will be using the following model:

SSD MobileNet V2 FPNLite 320x320.

Download it into your Colab Notebook and extract it by executing:

%cd pre-trained-models
!curl "http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.tar.gz" --output "ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.tar.gz"
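import os
import tarfile

model_name = 'ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8'
model_file = model_name + '.tar.gz'
tar = tarfile.open(model_file)
tar.extractall()
tar.close()
os.remove(model_file)  # delete the archive once it has been extracted
%cd /content/workspace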

Your directory structure should now look like this:

├─ models/
│ ├─ community/
│ ├─ official/
│ ├─ orbit/
│ ├─ research/
│ ├─ my_mobilenet/
│ └─ ...
├─ annotations/
│ ├─ train/
│ └─ test/
├─ pre-trained-models/
├─ exported-models/

Editing the Configuration file

In the TF Object Detection API, all the settings and information required for training and evaluating the model are stored in the pipeline.config file.

Let us take a look at it:

The most important ones we will need to change are

batch_size: 128
fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED"
num_steps: 50000
num_classes: 2
fine_tune_checkpoint_type: "classification"
train_input_reader {
  label_map_path: "PATH_TO_BE_CONFIGURED"
  tf_record_input_reader {
    input_path: "PATH_TO_BE_CONFIGURED"
  }
}
eval_input_reader {
  label_map_path: "PATH_TO_BE_CONFIGURED"
  shuffle: false
  num_epochs: 1
  tf_record_input_reader {
    input_path: "PATH_TO_BE_CONFIGURED"
  }
}

batch_size is the number of training examples the model processes in each training step. A suitable number to use is 8. It could be more or less depending on the computing power available.

A good suggestion given on StackOverflow is:

Max batch size= available GPU memory bytes / 4 / (size of tensors + trainable parameters)
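
To make the rule of thumb concrete, here is a small sketch that applies it. The divisor 4 is read as 4 bytes per float32 value, and the two counts (values held in tensors and trainable parameters) are made-up numbers for illustration, not measurements of the model used in this tutorial. Treat the result as a starting point and lower it if you run out of memory.

```python
def max_batch_size(gpu_memory_bytes, tensor_values, trainable_parameters):
    """Rule of thumb: memory / 4 bytes per float32 / (tensor values + parameters)."""
    return gpu_memory_bytes // 4 // (tensor_values + trainable_parameters)


# Hypothetical numbers: a 12 GiB GPU, 50 million tensor values and
# 3 million trainable parameters.
estimate = max_batch_size(
    gpu_memory_bytes=12 * 1024**3,
    tensor_values=50_000_000,
    trainable_parameters=3_000_000,
)
print(estimate)
```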

fine_tune_checkpoint is the checkpoint that training starts from (a checkpoint is how TensorFlow stores the state of a model).

If you are starting the training for the first time, set this to the checkpoint of the pre-trained model.

If you want to continue training from a previously trained checkpoint, set it to that checkpoint's path. (Training then resumes from the weights already learned instead of starting over from the pre-trained model.)

# For fresh training
fine_tune_checkpoint: "pre-trained-models/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8/checkpoint/ckpt-0"
# For continuing the training, use this line instead
# fine_tune_checkpoint: "exported-models/your_latest_batch/checkpoint/ckpt-0"
batch_size: 8 # Increase or decrease this value depending on how fast your training job runs and the compute resources available.
num_steps: 25000 # 25000 is a good number of steps to get a good loss.
fine_tune_checkpoint_type: "detection" # Set this to detection
train_input_reader {
  label_map_path: "annotations/label_map.pbtxt" # Set to location of label map
  tf_record_input_reader {
    input_path: "annotations/train.tfrecord" # Set to location of train TFRecord file
  }
}
# Similarly do the same for the eval input reader
eval_input_reader {
  label_map_path: "annotations/label_map.pbtxt"
  shuffle: false
  num_epochs: 1
  tf_record_input_reader {
    input_path: "annotations/test.tfrecord"
  }
}
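
If you would rather not edit the file by hand in Colab, the same changes can be scripted. The sketch below treats pipeline.config as plain text and rewrites the first line that starts with each key; it is an illustration, not the API's own config tooling, and the function name update_pipeline_config and the sample text are assumptions. Only the first match is changed, so fields that appear more than once with different values (such as input_path or label_map_path) still need individual handling.

```python
import re


def update_pipeline_config(config_text, new_values):
    """Return config_text with the first `key: value` line of each key rewritten."""
    for key, value in new_values.items():
        # Match "key:" at the start of a line (after indentation) and the rest of that line.
        pattern = re.compile(rf"^([ \t]*{re.escape(key)}:[ \t]*).*$", re.MULTILINE)
        config_text, replaced = pattern.subn(
            lambda match: match.group(1) + value, config_text, count=1
        )
        if replaced == 0:
            raise KeyError(f"{key} not found in the config text")
    return config_text


# Shortened, hypothetical excerpt of a pipeline.config file.
SAMPLE_CONFIG = """train_config {
  batch_size: 128
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED"
  num_steps: 50000
  fine_tune_checkpoint_type: "classification"
}
eval_config {
  batch_size: 1
}
"""

updated = update_pipeline_config(SAMPLE_CONFIG, {
    "batch_size": "8",
    "num_steps": "25000",
    "fine_tune_checkpoint": '"pre-trained-models/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8/checkpoint/ckpt-0"',
    "fine_tune_checkpoint_type": '"detection"',
})
print(updated)
```

To apply it to the real file, read pipeline.config into a string, pass it through the function and write the result back.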

After editing the config file, we need to add the TensorFlow object detection folders to the python path.

import os
os.environ['PYTHONPATH'] += ':/content/workspace/models/:/content/workspace/models/research/:/content/workspace/models/research/slim/'

Setting up TensorBoard on Colab to monitor the training process

Colab has built-in support for TensorBoard, which can be started with a simple magic command as follows:

%load_ext tensorboard
%tensorboard --logdir 'models/my_mobilenet'


The TensorBoard cell will be empty at first, which is nothing to worry about. Once the training job has started, wait a few minutes for the .tfevents files to be created, then click refresh at the top right of the TensorBoard cell to see the training logs.

Running the Training Job

We will copy the TensorFlow training python script to the workspace directory for ease of access.

!cp '/content/workspace/models/research/object_detection/model_main_tf2.py' .

The training job requires command-line arguments, namely:

  • model_dir : This refers to the path where the training process will store the checkpoint files.
  • pipeline_config_path : This refers to the path where the pipeline.config file is stored

Execute the following command to start the training job

# If you are training for the first time
!python model_main_tf2.py --model_dir=models/my_mobilenet --pipeline_config_path=pre-trained-models/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8/pipeline.config
# Or if you are continuing from a previous training
!python model_main_tf2.py --model_dir=models/my_mobilenet --pipeline_config_path=exported-models/pipeline.config

If everything goes well, the training output cell should look like this

Training Output

The output updates slowly: by default the training script logs only every 100 steps, so the first loss value appears at step 100. The speed depends on whether a GPU is being used, the available VRAM, and many other factors, so be patient.

Refresh the TensorBoard while the training is running and you will be able to monitor the progress

Tensorboard Output

Once the loss settles at a fairly constant value or drops below a target (0.05 in my case), you can stop the training cell.
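
If you prefer a mechanical check over eyeballing the curve, the sketch below is one possible way to express that stopping rule. It assumes you have copied the logged loss values into a list; the function name, the window size and the tolerance are illustrative choices, and only the 0.05 target comes from the run described here.

```python
def should_stop_training(losses, target=0.05, window=5, tolerance=0.01):
    """Return True if the latest loss is below target or the loss has flattened out."""
    if not losses:
        return False
    if losses[-1] < target:
        return True
    if len(losses) < 2 * window:
        return False  # not enough history to judge a plateau
    previous_mean = sum(losses[-2 * window:-window]) / window
    recent_mean = sum(losses[-window:]) / window
    # Plateau: the mean of the last window moved by less than `tolerance`
    # (relative) compared with the window before it.
    return abs(previous_mean - recent_mean) <= tolerance * abs(previous_mean)


still_falling = [1.0, 0.8, 0.6, 0.5, 0.4, 0.3, 0.25, 0.2, 0.15, 0.1]
flattened = [0.5] * 10
print(should_stop_training(still_falling), should_stop_training(flattened))
```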

Evaluating the model

Now you can run the evaluation script to find out the mAP (Mean Average Precision) and the Loss.

Run the following in a cell:

# --pipeline_config_path is the config used for training; --checkpoint_dir is the folder where the model saved its checkpoints during training
!python model_main_tf2.py --model_dir=exported-models/checkpoint --pipeline_config_path=pre-trained-models/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8/pipeline.config --checkpoint_dir=models/my_mobilenet

You should get an output that looks like this

Evaluation Output

The evaluation script was designed to run in parallel with the training job, so after evaluating the latest checkpoint it waits for a new one to appear, with a default timeout of 3600 seconds. Since we are running it after training has finished on Colab, no new checkpoint will arrive.

You may go ahead and stop the evaluation cell once the metrics have been printed.

Exporting the model

Now that we have our model ready, we need to save it in a format we can use later.

We now have a bunch of checkpoints in the models/my_mobilenet folder. To remove all the older checkpoints and keep the latest checkpoint, I have attached a neat little python script that will do the task automatically.
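
A minimal sketch of such a cleanup is shown below. It assumes the checkpoints are stored as ckpt-N.index and ckpt-N.data-* files directly inside the training folder, and it leaves every other file (including the small checkpoint state file) untouched. The function name keep_latest_checkpoint is an illustrative assumption.

```python
import os
import re

# Matches files such as ckpt-12.index or ckpt-12.data-00000-of-00001.
CHECKPOINT_FILE = re.compile(r"^ckpt-(\d+)\.(index|data-\d+-of-\d+)$")


def keep_latest_checkpoint(checkpoint_dir):
    """Delete all ckpt-N files except those with the highest N; return that N."""
    numbered = []
    for filename in os.listdir(checkpoint_dir):
        match = CHECKPOINT_FILE.match(filename)
        if match:
            numbered.append((int(match.group(1)), filename))
    if not numbered:
        return None
    latest = max(number for number, _ in numbered)
    for number, filename in numbered:
        if number != latest:
            os.remove(os.path.join(checkpoint_dir, filename))
    return latest


# Only runs when the training folder from this tutorial exists.
if os.path.isdir("models/my_mobilenet"):
    print(keep_latest_checkpoint("models/my_mobilenet"))
```

Deleting files cannot be undone, so copy the folder to Google Drive first if you might want an earlier checkpoint.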

Now to export the model, we run the export script provided by TF2, as follows:

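!python /content/workspace/models/research/object_detection/exporter_main_v2.py --input_type=image_tensor \
--pipeline_config_path=pre-trained-models/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8/pipeline.config \
--trained_checkpoint_dir=models/my_mobilenet \
--output_directory=exported-models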

The export script will save the model in the exported-models folder with the following structure:

├─ exported-models/
│ ├─ checkpoint/
│ ├─ saved_model/
│ └─ pipeline.config

You can now upload this folder to Google Drive or download it to save it for future use.

Inference on the model

The final step is the one that fills you with a sense of accomplishment: we will test our model on a random input image and see it predict the type of object and its bounding box.

The entire process is a little tedious, but I will attach a script that will let you perform inference directly on Google Colab.
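
One part of that process can be sketched without TensorFlow: turning the raw model output into something you can draw. The exported model returns, among other things, detection_boxes (normalized, in [ymin, xmin, ymax, xmax] order), detection_scores and detection_classes. The snippet below is a minimal illustration that keeps the confident detections and converts their boxes to pixel coordinates; the function name, the 0.5 threshold and the sample values are assumptions, and plain lists stand in for the model's output tensors.

```python
def filter_detections(boxes, scores, classes, image_width, image_height,
                      id_to_name, min_score=0.5):
    """Keep detections scoring at least min_score and convert boxes to pixels."""
    results = []
    for box, score, class_id in zip(boxes, scores, classes):
        if score < min_score:
            continue
        ymin, xmin, ymax, xmax = box  # normalized [ymin, xmin, ymax, xmax]
        results.append({
            "label": id_to_name.get(int(class_id), "unknown"),
            "score": float(score),
            # Pixel box as (left, top, right, bottom).
            "box_pixels": (
                round(xmin * image_width),
                round(ymin * image_height),
                round(xmax * image_width),
                round(ymax * image_height),
            ),
        })
    return results


# Hypothetical output for one 640x480 image: a confident car and a weak bike.
detections = filter_detections(
    boxes=[[0.25, 0.1, 0.75, 0.5], [0.0, 0.0, 0.2, 0.2]],
    scores=[0.9, 0.3],
    classes=[1.0, 2.0],
    image_width=640,
    image_height=480,
    id_to_name={1: "car", 2: "bike"},
)
print(detections)
```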

The output of the inference should be like this


You can adapt the above script to take a video as input and perform inference on that.


Congratulations! You have built an object detection model with TensorFlow 2.

That's it for the tutorial! I hope you face no issues while following along. If you have any questions, please comment and I will respond.

Refer to the Tensorflow Github page.

My socials -> LinkedIn, Medium

Thank you for reading!

