On the Margins: Non-maximum Suppression with TensorFlow

I’m writing a series of posts on supercharging object detection inference performance in video streams using TensorFlow and cool tech from NVIDIA: step by step, starting from 6 fps and going all the way up to 230. But before I start, here is a short post about a little gem that I think is often overlooked.

Anyone in the object detection business knows that non-maximum suppression is indispensable. But try finding an implementation in the machine learning library of your choice! I couldn’t. Naturally, there are lots of implementations on GitHub, so it is not a big deal, but I would still expect that when I am using a machine learning framework, something so crucial should just work out of the box.

Turns out TensorFlow does this! Thank you, Google!
So here is my five-liner to wrap it:

import tensorflow as tf  # TensorFlow 1.x style API: an explicit session is used

def non_max_suppression_with_tf(sess, boxes, scores, max_output_size, iou_threshold=0.5):
    """Provide a TensorFlow session and get non-maximum suppression.

    max_output_size and iou_threshold are passed to tf.image.non_max_suppression.
    """
    # Indices of the boxes that survive suppression
    non_max_idxs = tf.image.non_max_suppression(boxes, scores, max_output_size, iou_threshold=iou_threshold)
    # The cast drops fractional parts, so this assumes pixel coordinates
    new_boxes = tf.cast(tf.gather(boxes, non_max_idxs), tf.int32)
    new_scores = tf.gather(scores, non_max_idxs)
    return sess.run([new_boxes, new_scores])

Note that boxes and scores can be NumPy arrays. An active TensorFlow session is still required for the results to materialize.
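For intuition about what comes back from the wrapper, here is a rough plain-Python sketch of greedy non-maximum suppression. It is only an illustration, not TensorFlow's implementation: the helper names iou and greedy_nms are made up for this example, the boxes are assumed to be (y1, x1, y2, x2) with y1 < y2 and x1 < x2, and a box is assumed to be dropped when its IoU with an already kept box is strictly greater than the threshold.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (y1, x1, y2, x2)."""
    inter_h = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_w = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_h * inter_w
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, max_output_size, iou_threshold=0.5):
    """Return indices of the kept boxes, highest score first (hypothetical helper)."""
    # Visit boxes from the highest score to the lowest
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if len(keep) >= max_output_size:
            break
        # Keep a box only if it does not overlap too much with any kept box
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes and one box far away from both
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores, max_output_size=10)
print(kept)
```

The second box overlaps the first with an IoU of 81/119 (about 0.68), so at the default threshold of 0.5 it is suppressed and only the first and third boxes survive. The wrapper above returns the gathered boxes and scores rather than the indices, but the selection idea is the same.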
