YOU ONLY LOOK ONCE (Real-Time Object Detection, YOLO)
This deep learning technique is used in self-driving cars nowadays
This tutorial covers building a real-time object detection deep learning model (using YOLO) in Google Colab with TensorFlow on a custom dataset. (We will do all our work inside Google Colab, because training YOLO is a resource-intensive task and Colab is usually much faster than a personal machine.)
YOLO, which stands for “You Only Look Once”, is an extremely fast, state-of-the-art real-time object detection system that can detect multiple objects in a given image at the same time. On a Titan X, it processes images at 40-90 FPS (frames per second) and has a mAP of 78.6% on VOC 2007 and a mAP of 48.1% on COCO test-dev.
How It Works
Prior object detection systems repurpose classifiers or localizers to perform detection. They apply the model to an image at multiple locations and scales, i.e., to locate an object in the given image the model has to go over the image many times. High-scoring regions of the image are considered detections.
YOLO uses a totally different approach: it applies a single neural network to the full image. This network divides the image into regions and predicts bounding boxes and probabilities for each region. These bounding boxes are weighted by the predicted probabilities (confidence scores).
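As a rough illustration of that weighting, the sketch below multiplies each box's object confidence by its per-class probabilities and keeps only the boxes whose best score passes a threshold. The function name, the dictionary layout, and the 0.5 threshold are assumptions made for this example; this is not darkflow's API or output format.

```python
def score_boxes(boxes, class_names, threshold=0.5):
    """Return (label, score, coords) for boxes whose best class score passes threshold.

    Each box is a dict with hypothetical keys:
      'coords'      -> (x, y, w, h)
      'confidence'  -> probability that the box contains any object
      'class_probs' -> one conditional class probability per entry in class_names
    """
    detections = []
    for box in boxes:
        # Class-specific score = P(object) * P(class | object)
        scores = [box["confidence"] * p for p in box["class_probs"]]
        best = max(range(len(scores)), key=scores.__getitem__)
        if scores[best] >= threshold:
            detections.append((class_names[best], scores[best], box["coords"]))
    return detections


# Two made-up boxes: one confident detection and one weak one.
example_boxes = [
    {"coords": (0.5, 0.5, 0.2, 0.3), "confidence": 0.9, "class_probs": [0.8, 0.2]},
    {"coords": (0.1, 0.1, 0.1, 0.1), "confidence": 0.3, "class_probs": [0.6, 0.4]},
]
detections = score_boxes(example_boxes, ["monkey", "dog"])
print(detections)
```

Only the first box survives here, because its best class score is well above the threshold while the second box's scores stay low.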
The YOLO model has several advantages over classifier-based object detection systems. It looks at the whole image at test time, so its predictions are informed by global context in the image. It also makes predictions with a single network evaluation, which makes it extremely fast: more than 1000x faster than R-CNN and 100x faster than Fast R-CNN.
If you want to learn more about the architecture and mathematics behind YOLO, please read the original paper.
Watch this interesting video about real-time object detection; it will motivate you to explore this topic even more.
Original paper (CVPR 2016. OpenCV People’s Choice Award): https://arxiv.org/pdf/1506.02640v5.pdf
Cool things about YOLO:
- Speed: it detects objects at 45 frames per second, which is faster than real time.
- The network learns a generalized object representation, i.e., we can train it on real-world images and get predictions on artwork or computer-generated images, and vice versa.
- A faster version with a smaller architecture (sometimes known as Tiny YOLO) runs at 155 frames per second but is less accurate than the original YOLO.
- And above all, YOLO is open source.
Although there are a lot of pre-trained models on the internet for various datasets, in this tutorial we will train our own model to detect the objects we are interested in. We will train the model in the cloud using Google Colab; you only need a browser and a working internet connection. The steps are:
- Preparing Dataset
- Uploading everything to Google Drive
- Setting up environment in GOOGLE COLAB
- Training model
- Doing Prediction on Images and Video
- Saving the model's weights and configuration file for future use
In this part of the blog, we will cover the first three steps; the remaining steps are covered in Part 2.
Step 1: Preparing Dataset
This is a crucial step: the performance of your model depends on the quality of the data you collect.
a) Search for images and videos related to your problem domain.
b) Be aware of what problem you are going to solve.
- For example, in my case monkeys can be anywhere in the frame, and they come in different shapes, colours, and orientations.
- So I collected almost 5000 images from the internet to train my model.
- You can likewise search and scrape the internet as per your requirements.
If you have a video and want to extract frames from it, use ffmpeg:
$ sudo apt install ffmpeg
$ ffmpeg -i input.mp4 output.avi
To read more about what you can do with ffmpeg (changing frame numbers and times), go to this link.
$ ffmpeg -i monkey2.mp4 -vf fps=1/3 thumb%04d.jpg -hide_banner
The command above saves one frame every three seconds (fps=1/3); change the fps value to sample more or fewer frames.
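If you have several videos, the same ffmpeg call can be driven from Python. This is a minimal sketch: the helper names, the images output folder, and the default of one frame every three seconds are assumptions for illustration, and it expects ffmpeg to already be installed.

```python
import subprocess
from pathlib import Path


def build_frame_command(video_path, out_dir, seconds_per_frame=3):
    """Build an ffmpeg command that saves one frame every `seconds_per_frame` seconds."""
    out_pattern = str(Path(out_dir) / "thumb%04d.jpg")
    return [
        "ffmpeg", "-hide_banner",
        "-i", str(video_path),
        "-vf", f"fps=1/{seconds_per_frame}",
        out_pattern,
    ]


def extract_frames(video_path, out_dir, seconds_per_frame=3):
    """Create the output folder and run ffmpeg (requires ffmpeg on the PATH)."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_frame_command(video_path, out_dir, seconds_per_frame), check=True)


# Only build and show the command here; call extract_frames(...) to actually run it.
command = build_frame_command("monkey2.mp4", "images")
print(" ".join(command))
```

Keeping the command builder separate from the call that runs it makes it easy to check the arguments before processing a whole folder of videos.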
Making annotations by drawing rectangles:
We will use labelImg for this task:
(make sure you have PyQt installed)
$ git clone https://github.com/tzutalin/labelImg
$ cd labelImg/
$ make all
$ ./labelImg.py
To start it again in the future:
$ cd labelImg/
$ python3 labelImg.py
Make two folders in your working directory: images and annotations.
Draw boxes around the objects that you want to detect.
Make sure to set 'Change Save Dir' to the location of the annotations folder.
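Before uploading, it can be worth checking that every image actually got an annotation. The sketch below assumes labelImg saved one XML file per image with the same base name (for example monkey_0001.jpg and monkey_0001.xml); the helper name and the list of image extensions are made up for this example, and the demo runs on a temporary folder rather than your real dataset.

```python
import tempfile
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def unannotated_images(images_dir, annotations_dir):
    """Return image file names that have no matching .xml file in annotations_dir."""
    annotated = {p.stem for p in Path(annotations_dir).glob("*.xml")}
    return sorted(
        p.name
        for p in Path(images_dir).iterdir()
        if p.suffix.lower() in IMAGE_EXTENSIONS and p.stem not in annotated
    )


# Demo on a throwaway folder: a.jpg is annotated, b.jpg is not.
with tempfile.TemporaryDirectory() as tmp:
    images = Path(tmp, "images")
    annotations = Path(tmp, "annotations")
    images.mkdir()
    annotations.mkdir()
    for name in ("a.jpg", "b.jpg"):
        (images / name).touch()
    (annotations / "a.xml").touch()
    missing = unannotated_images(images, annotations)

print(missing)
```

On your own data you would call unannotated_images("images", "annotations") and annotate or remove whatever it lists.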
Congrats, your dataset is now ready to be uploaded to Google Drive.
Step 2: Uploading everything to Google Drive
What I mean by everything:
a) Your images and annotation files
b) The YOLO weight file
We are using YOLOv2 because it is much faster.
Go to the link and download the weight file from there.
Upload everything to your drive:
(You have to upload three items: the images folder, the annotations folder, and yolov2-voc.weights.)
Congratulations! You are now ready to set up your environment in the cloud.
Step 3: Setting up an environment in GOOGLE COLAB:
(This part will be different for your problem domain so read it carefully)
- Fork my GitHub repository so that you can make changes according to your requirements.
- The file structure of the repository is:
- Our main configuration file is:
You have to edit yolov2-voc-1c.cfg and labels.txt if you are training for more classes.
(Do not change yolov2-voc.cfg; here is why.)
A) Changing labels.txt
(Make sure your labels are exactly the same as the ones you used when making the annotation files with labelImg.)
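A quick way to catch a mismatch is to read the label names out of an annotation file and compare them with the entries in labels.txt. The sketch below assumes the annotations are Pascal VOC style XML with an `<object><name>` element per box; the function names and the sample annotation are invented for this example.

```python
import xml.etree.ElementTree as ET


def labels_in_annotation(xml_text):
    """Collect every <object><name> value from one Pascal VOC style annotation."""
    root = ET.fromstring(xml_text.strip())
    names = set()
    for obj in root.iter("object"):
        name = obj.findtext("name")
        if name and name.strip():
            names.add(name.strip())
    return names


def unknown_labels(xml_text, labels):
    """Return annotation labels that do not appear among the labels.txt entries."""
    return labels_in_annotation(xml_text) - set(labels)


# Made-up annotation with a capitalisation slip in the second box.
sample_annotation = """
<annotation>
  <filename>monkey_0001.jpg</filename>
  <object><name>monkey</name></object>
  <object><name>Monkey</name></object>
</annotation>
"""
labels_txt = ["monkey"]  # stands in for the lines of labels.txt
unknown = unknown_labels(sample_annotation, labels_txt)
print(unknown)
```

The comparison is case-sensitive on purpose, so "Monkey" and "monkey" are reported as different labels.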
B) Changing yolov2-voc-1c.cfg:
I am training on only one class, so I named it yolov2-voc-1c.cfg. If you are training to detect 4 objects, rename it to yolov2-voc-4c.cfg. This is a general convention followed in the official implementation.
Rules for changing the config file:
At line 244, change the number of classes.
At line 237, change the number of filters:
filters = (classes+5)*5
i.e., for four classes it will be equal to (4+5)*5=45
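The same rule can be written as a small helper, together with a sketch that patches the two values in the text of a config file. It assumes the values to change are the last filters= line before the [region] section and the classes= line inside it, and that the trailing 5 in the formula is the number of anchor boxes; the function names and the shortened sample config are illustrative only.

```python
import re


def num_filters(num_classes, num_anchors=5):
    """filters = (classes + 5) * anchors, matching filters = (classes+5)*5."""
    return (num_classes + 5) * num_anchors


def patch_cfg(cfg_text, num_classes, num_anchors=5):
    """Return cfg_text with the final filters= and the region classes= updated."""
    region_start = cfg_text.rfind("[region]")
    if region_start == -1:
        raise ValueError("no [region] section found")
    head, tail = cfg_text[:region_start], cfg_text[region_start:]

    # Last filters= line before [region] belongs to the final convolutional layer.
    matches = list(re.finditer(r"^filters\s*=\s*\d+", head, flags=re.MULTILINE))
    if not matches:
        raise ValueError("no filters= line found before [region]")
    last = matches[-1]
    new_filters = f"filters={num_filters(num_classes, num_anchors)}"
    head = head[:last.start()] + new_filters + head[last.end():]

    # First classes= line inside the [region] section.
    tail = re.sub(r"^classes\s*=\s*\d+", f"classes={num_classes}",
                  tail, count=1, flags=re.MULTILINE)
    return head + tail


# Shortened, made-up config with the 20-class defaults.
sample_cfg = (
    "[convolutional]\nfilters=1024\n\n"
    "[convolutional]\nfilters=125\nactivation=linear\n\n"
    "[region]\nclasses=20\nnum=5\n"
)
patched = patch_cfg(sample_cfg, num_classes=4)
print(num_filters(1), num_filters(4))
print(patched)
```

Editing by section rather than by line number avoids changing the wrong line if your copy of the config file has a different layout.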
Everything related to configuration is now done, so it is time to install the packages on Colab.
Run the following code in separate Colab cells:
! git clone https://github.com/veshitala/darkflow.git #clone your own repository
# move into the cloned repository so that setup.py can be found
%cd darkflow
! python3 setup.py build_ext --inplace
! pip install -e .
! pip install .
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
from darkflow.net.build import TFNet
import cv2
# Start by connecting Google Drive to Google Colab
from google.colab import drive
drive.mount('/content/gdrive')
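Once the drive is mounted, it can help to confirm that the three uploads from Step 2 are visible before going further. The helper below is a hypothetical sketch: the folder you pass in is an assumption about where you uploaded the files inside the mounted drive, and the demo uses a temporary folder so it runs anywhere.

```python
import tempfile
from pathlib import Path

EXPECTED_ITEMS = ("images", "annotations", "yolov2-voc.weights")


def missing_uploads(drive_dir, expected=EXPECTED_ITEMS):
    """Return the expected names that are not present in drive_dir."""
    base = Path(drive_dir)
    return [name for name in expected if not (base / name).exists()]


# Demo on a temporary folder standing in for the mounted drive folder.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "images").mkdir()
    Path(tmp, "annotations").mkdir()
    missing = missing_uploads(tmp)

print(missing)
```

In Colab you would pass the folder inside the mounted drive that holds your uploads; an empty list means everything is in place.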
All right, we have done great work so far:
- Dataset preparation
- Moving everything to Google Drive
- Setting up an environment in COLAB
Now comes the most exciting part of this project.
Training and prediction are covered in Part 2 of this blog.
The result that we'll achieve at the end:
For any queries, feel free to comment below.
Please Like👍 and share your valuable feedback and suggestions.