
Real-time Object Detection Without Machine Learning
Deep Learning vs. Heuristics
Earlier this year Nick Bourdakos, a software developer at IBM, posted a series of videos demoing real-time object detection in a web browser. One of his early videos went viral, receiving over 16,000 likes and 900+ comments on LinkedIn. Here’s the original post:
The video shows three bottles (Coke, Pepsi, and Mountain Dew) being recognised by the computer in real time as they are held up to the camera. When a bottle is detected, it is given a text label and a bounding box is drawn around it. If more than one bottle is held up, the system labels each of them correctly.
Nick’s system has now evolved into IBM Cloud Annotations, but the demo above used TensorFlow.js along with the COCO-SSD deep learning model. SSD, or Single Shot MultiBox Detector, is a widely used technique for detecting multiple objects in a single frame, described in detail here. This is a task deep learning excels at, and these techniques are now so widespread that you probably have a deep learning network in your pocket, running your phone’s object detection for photos or social networking apps.
The “no machine learning” challenge
Inspired by Nick’s post, I decided to challenge myself to explore whether similar results could be achieved without the use of machine learning. It struck me that the bottles used in the original demo could be detected based on their colour or other characteristics, along with some simple matching rules. This is known as a heuristic approach to problem solving.
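To make “colour plus simple matching rules” a little more concrete, here is a minimal Python sketch of the idea. It is an illustration only, not the code behind this project: the hue ranges, thresholds, and the function names `label_pixel` and `detect` are all assumptions chosen for the example, and a real frame would come from a camera rather than a nested list.

```python
import colorsys

# Hypothetical hue ranges (in degrees) for each bottle's dominant colour.
# These are illustrative guesses, not values measured from the demo.
HUE_RULES = {
    "Coke": [(0, 15), (345, 360)],   # red wraps around the hue circle
    "Pepsi": [(200, 250)],           # blue
    "Mountain Dew": [(90, 150)],     # green
}
MIN_SATURATION = 0.5   # ignore washed-out pixels
MIN_VALUE = 0.3        # ignore very dark pixels
MIN_PIXELS = 4         # ignore tiny specks of matching colour


def label_pixel(r, g, b):
    """Return the label whose hue range contains this RGB pixel, or None."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    if s < MIN_SATURATION or v < MIN_VALUE:
        return None
    hue = h * 360
    for label, ranges in HUE_RULES.items():
        if any(lo <= hue <= hi for lo, hi in ranges):
            return label
    return None


def detect(frame):
    """Return {label: (x_min, y_min, x_max, y_max)} for one frame.

    `frame` is a list of rows, each row a list of (r, g, b) tuples.
    """
    boxes, counts = {}, {}
    for y, row in enumerate(frame):
        for x, (r, g, b) in enumerate(row):
            label = label_pixel(r, g, b)
            if label is None:
                continue
            counts[label] = counts.get(label, 0) + 1
            x0, y0, x1, y1 = boxes.get(label, (x, y, x, y))
            boxes[label] = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))
    # Matching rule: only report a label backed by enough pixels.
    return {k: box for k, box in boxes.items() if counts[k] >= MIN_PIXELS}
```

Each rule here is something a person can read and adjust by hand, which is the main contrast with a trained model. The obvious weakness is that one bounding box per colour cannot separate two bottles of the same kind, and lighting changes will shift the hues.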
Potential advantages of a heuristic approach include:
- Ease of development and conceptualisation
- Lower CPU and memory use
- Fewer dependencies
In terms of CPU and memory, on my i5 MacBook Pro the IBM Cloud Annotations demo uses over 100% CPU and more than 1.5 GB of RAM. It also relies on a web browser and some heavy dependencies…