http://www.youtube.com/watch?v=rR6EIakYSZ0 In this series I will explore different tools and techniques for doing object detection in streaming video in real time or faster, starting with a slow baseline Python detector and gradually picking up speed. In the course of these posts we will explore optimizing object detection in videos. We … Continue reading Supercharging Object Detection in Video: from Glacial to Lightning Speed
Tag: CUDA
Supercharging Object Detection in Videos: Setup
We started with the Python object detector's performance as the baseline (~19 fps). Next we ditch Python and all our pre-installed libraries and custom-build everything. C++ will become the development environment, not just because it's more "bare bones" than Python and thus more performant, but also to access functionality not available in Python. Environment … Continue reading Supercharging Object Detection in Videos: Setup
Supercharging Object Detection in Video: First App
Tensorflow C++ Video Detector It is time to validate all this arduous setup work, run our first C++ detector, and reap the first benefits. You may clone this repository, which is a fork of this repository, modified and adapted to modern times. Ensuring the Right Build Paths Note the following excerpt from CMakeLists.txt: The … Continue reading Supercharging Object Detection in Video: First App
Supercharging Object Detection in Video: Optimizing Decoding and Graph Feeding
In the previous post we validated our install and ran a simple detector in C++. It is now time to start optimizing it. Source code for the finished project is here. Optimizing Video Decoding If we build and run the video_reader.cpp OpenCV sample, we will observe a staggering performance improvement available in OpenCV for decoding … Continue reading Supercharging Object Detection in Video: Optimizing Decoding and Graph Feeding
Supercharging Object Detection in Video: TensorRT 5
Source code for the finished project is here. NVIDIA TensorRT is a framework used to optimize deep networks for inference by performing surgery on graphs trained with popular deep learning frameworks: Tensorflow, Caffe, etc. Preparing the Tensorflow Graph Our code is based on the Uff SSD sample installed with TensorRT 5.0. The guide together with … Continue reading Supercharging Object Detection in Video: TensorRT 5
Zooming Through Euler Path: Supercharging with GPU
So, continuing where we left off: Walking the Euler Path: Intro; Visualizing Graphs; Walking the Euler Path: GPU for the Road; Walking the Euler Path: PIN Cracking and DNA Sequencing. For the Win: And finally I ran the GPU-enabled algorithm for finding the Euler path. And the results: Generating euler graph: vertices = 10,485,760; avg … Continue reading Zooming Through Euler Path: Supercharging with GPU
Walking the Euler Path: GPU for the Road
Continuation of the previous posts: Intro Visualization GPU Digression I was going to talk about something else this week but figured I'd take advantage of the free-hand format and digress a bit. Continuing the travel metaphor and remembering Julius Caesar's "alea iacta est", we'll talk about GPU algorithms, for which I invariably use my favorite Alea.CUDA … Continue reading Walking the Euler Path: GPU for the Road
Walking the Euler Path: Intro
Source Code I'm thinking about a few posts in this series going very fast through the project. The source is on my GitHub; check out the tags, since the master branch is still a work in progress. Experimenting with Graph Algorithms with F# and GPU Graphs play their role in bioinformatics, which is my favorite area … Continue reading Walking the Euler Path: Intro
Look-and-say: [Alea.]CUDA
Continuing the Advent of Code theme from the previous post. Figured since this year is going to be my year of CUDA, this would be a good opportunity to take it for a ride. A good April 1st post, but why wait? So, how can we make this even faster than the already fast imperative … Continue reading Look-and-say: [Alea.]CUDA
Non-linear Thinking with CUDA.
I love GPU programming for precisely this: it forces and enables you to think about a solution in a non-linear fashion in more than one sense of the word. The Problem Given a set $latex A = \{a_1, a_2 \ldots a_n\}$, output a set $latex S_A = \{0,\ \sum\limits_{k=1}^{n} a_k,\ \sum\limits_{k=i}^{i + j \mod n} … Continue reading Non-linear Thinking with CUDA.