How to Process Live Video Stream Using FFMPEG and OpenCV

22 Jun 2017

When you perform a good card trick, all the details and complexity should be invisible to the observer. The magic should appear smooth and natural! Today we'll take a look behind the scenes. Enter the live stream.

Most social media platforms support live streaming: YouTube, Facebook, Snapchat, Instagram... Live mode is a common thing nowadays. If your product isn't live yet, add it to your to-do list!

A live stream (broadcast) is not a peer-to-peer data flow model. It's a more complex solution, and that's what we're going to talk about: the backstage. Let's walk through what is hidden behind “LIVE”.

How to stream live?

Three stages build up the stream: Capture, Encode and Go Live. Here is more about each of them.

  • Capture - at this stage we capture the input stream as raw data. The source could be a file or another stream.
  • Encode - preparing and formatting the input stream. Frames are converted to a BGR/HSV pixel format so that each one can be analyzed and edited, and the output is compressed with a codec to improve performance and decrease latency.
  • Go Live - create a shared stream endpoint with multi-connection support.




So what's cooking? How do you create a starting point and handle the first frame of the stream? The main tool for that is the FFmpeg library. FFmpeg is a free software project that produces libraries and programs for handling multimedia data. No matter what source you are going to use with FFmpeg (screen, camera, file), you can set it up right from the command line:

Mac OS. AVFoundation media device list


… capture screen device.

Mac OS. AVFoundation screen device


Based on FFmpeg, the OpenCV library uses the same principles to handle the stream source. Typically, frames are read through cv::VideoCapture and displayed with cv::imshow.

As a result we see a window with the current stream capture.

cv::Mat frame is the current frame object. A cv::Mat represents a 2D pixel matrix in HSV or BGR pixel format. Its rows and columns form the pixel matrix, which serves as an intermediate format in the streaming process.
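To make the idea of a 2D pixel matrix concrete, here is a minimal dependency-free sketch of how an 8-bit, 3-channel BGR frame can be laid out in memory. BgrFrame and its methods are hypothetical names used only for illustration. This is not the cv::Mat API, only an approximation of the row-major layout that a continuous cv::Mat uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a continuous 8-bit, 3-channel frame (not cv::Mat itself).
struct BgrFrame {
    int rows;
    int cols;
    std::vector<std::uint8_t> data; // rows * cols * 3 bytes, row-major, B,G,R per pixel

    BgrFrame(int r, int c)
        : rows(r), cols(c), data(static_cast<std::size_t>(r) * c * 3, 0) {}

    // Offset of the first (blue) byte of the pixel at (row, col).
    std::size_t offset(int row, int col) const {
        assert(row >= 0 && row < rows && col >= 0 && col < cols);
        return (static_cast<std::size_t>(row) * cols + col) * 3;
    }

    // Write one pixel in B, G, R order.
    void setPixel(int row, int col, std::uint8_t b, std::uint8_t g, std::uint8_t r) {
        const std::size_t i = offset(row, col);
        data[i] = b;
        data[i + 1] = g;
        data[i + 2] = r;
    }

    // Read one channel of a pixel: 0 = blue, 1 = green, 2 = red.
    std::uint8_t channel(int row, int col, int ch) const {
        assert(ch >= 0 && ch < 3);
        return data[offset(row, col) + ch];
    }
};
```

Every per-frame analysis step ultimately walks a buffer like this one, which is why the number of rows and columns drives the processing cost.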

Video analysis


OpenCV was originally developed by Intel's research center and, in my opinion, it is the greatest leap in computer vision and media data analysis. The main thing to note about OpenCV is its high-performance analysis of 2D pixel matrices. At 30 frames per second, a 1280x720 stream already amounts to roughly 28 million pixels per second. You must be thinking to yourself, “That's a high load, isn't it?” It means the analysis has to be very fast to keep up without overloading your CPU. Multi-core processing and low-level optimization are what make the library so fast. It is a toolkit with a lot of algorithms, and it is a main tool in artificial intelligence when we are talking about media content. What can we analyze? Basically anything that can be represented as a 2D matrix. Let's see how to detect a card in the frame.
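To put the load in numbers, the small sketch below computes how many pixels per second a stream produces and how much time each frame leaves for analysis. The function names are hypothetical, and the 1280x720 and 1920x1080 resolutions are only example values.

```cpp
#include <cassert>
#include <cstdint>

// Pixels that have to be processed every second for a given frame size and rate.
std::int64_t pixelsPerSecond(int width, int height, int fps) {
    assert(width > 0 && height > 0 && fps > 0);
    return static_cast<std::int64_t>(width) * height * fps;
}

// Time available per frame, in milliseconds, if analysis must keep up with the stream.
double frameBudgetMs(int fps) {
    assert(fps > 0);
    return 1000.0 / fps;
}
```

For a 1280x720 stream at 30 frames per second this gives 27,648,000 pixels per second, for 1920x1080 it is 62,208,000, and in both cases the whole analysis has to fit into about 33 milliseconds per frame.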

Object tracking


Here is an example of isolating a playing card by reducing the image to maximum contrast. In one word: THRESHOLD. This method produces black and white zones that describe the object's area.


The threshold method takes the input and output frames, the threshold value, the maximum value and the threshold strategy as parameters.
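What the binary strategy does to each pixel can be sketched without any library. The code below follows the rule of cv::threshold with THRESH_BINARY (a pixel becomes the maximum value if it is above the threshold, otherwise 0). GrayImage and thresholdBinary are hypothetical names for this illustration, not OpenCV types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical grayscale image: one byte per pixel, row-major.
struct GrayImage {
    int rows;
    int cols;
    std::vector<std::uint8_t> data;
};

// Binary threshold: dst = maxValue where src > thresh, 0 elsewhere.
GrayImage thresholdBinary(const GrayImage& src, std::uint8_t thresh, std::uint8_t maxValue) {
    assert(src.data.size() == static_cast<std::size_t>(src.rows) * src.cols);
    GrayImage dst{src.rows, src.cols, std::vector<std::uint8_t>(src.data.size())};
    for (std::size_t i = 0; i < src.data.size(); ++i) {
        dst.data[i] = (src.data[i] > thresh) ? maxValue : static_cast<std::uint8_t>(0);
    }
    return dst;
}
```

With a threshold of 127 and a maximum value of 255, a bright card on a dark table turns into a white zone on a black background.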

Threshold gives the most satisfactory result in case there's high contrast in the image. We could also try to detect edges to describe the object in the frame. Canny is an edge detector: it finds edges where colour or contrast values change sharply.
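A full Canny detector is too long to reproduce here, so the sketch below covers only its gradient stage: a 3x3 Sobel filter with the magnitude taken as |Gx| + |Gy|. Canny additionally applies smoothing, non-maximum suppression and hysteresis with two thresholds, which this sketch leaves out. GrayImage and sobelMagnitude are hypothetical names, and leaving border pixels at 0 is a simplifying assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical grayscale image: one byte per pixel, row-major.
struct GrayImage {
    int rows;
    int cols;
    std::vector<std::uint8_t> data;

    int at(int r, int c) const {
        return data[static_cast<std::size_t>(r) * cols + c];
    }
};

// Sobel gradient magnitude |Gx| + |Gy| for interior pixels; border pixels stay 0.
std::vector<int> sobelMagnitude(const GrayImage& img) {
    assert(img.data.size() == static_cast<std::size_t>(img.rows) * img.cols);
    std::vector<int> mag(img.data.size(), 0);
    for (int r = 1; r + 1 < img.rows; ++r) {
        for (int c = 1; c + 1 < img.cols; ++c) {
            // Horizontal change: right column minus left column.
            const int gx = (img.at(r - 1, c + 1) + 2 * img.at(r, c + 1) + img.at(r + 1, c + 1))
                         - (img.at(r - 1, c - 1) + 2 * img.at(r, c - 1) + img.at(r + 1, c - 1));
            // Vertical change: bottom row minus top row.
            const int gy = (img.at(r + 1, c - 1) + 2 * img.at(r + 1, c) + img.at(r + 1, c + 1))
                         - (img.at(r - 1, c - 1) + 2 * img.at(r - 1, c) + img.at(r - 1, c + 1));
            mag[static_cast<std::size_t>(r) * img.cols + c] = std::abs(gx) + std::abs(gy);
        }
    }
    return mag;
}
```

A flat area yields a magnitude of 0, while a sharp step from black to white produces a large value, and those large values are the candidates that an edge detector keeps.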


By combining a couple of these methods we can detect contours.

Contours - an array of the contours detected in the frame. After some comparison and measurement we're getting closer to our target.


Card detection from a contour

Step 1. Read the video file

Step 2.

  • Reduce the colour to black and white
  • Blur the frame to approximate edges
  • Reduce the frame to maximum contrast (threshold)
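The first two preprocessing steps can be sketched on plain buffers. The grayscale conversion uses the common luma weights 0.299 R + 0.587 G + 0.114 B, and the blur is a simple 3x3 box filter with coordinates clamped at the image border. Both function names are hypothetical, and the clamped border is an assumption. OpenCV provides its own functions for these steps, with their own border modes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert an interleaved BGR buffer (3 bytes per pixel) to 8-bit grayscale.
std::vector<std::uint8_t> bgrToGray(const std::vector<std::uint8_t>& bgr, int rows, int cols) {
    const std::size_t pixels = static_cast<std::size_t>(rows) * cols;
    assert(bgr.size() == pixels * 3);
    std::vector<std::uint8_t> gray(pixels);
    for (std::size_t i = 0; i < pixels; ++i) {
        const int b = bgr[3 * i];
        const int g = bgr[3 * i + 1];
        const int r = bgr[3 * i + 2];
        // Integer form of 0.299 R + 0.587 G + 0.114 B, rounded to nearest.
        gray[i] = static_cast<std::uint8_t>((299 * r + 587 * g + 114 * b + 500) / 1000);
    }
    return gray;
}

// 3x3 box blur; coordinates outside the image are clamped to the nearest edge pixel.
std::vector<std::uint8_t> boxBlur3x3(const std::vector<std::uint8_t>& gray, int rows, int cols) {
    assert(gray.size() == static_cast<std::size_t>(rows) * cols);
    std::vector<std::uint8_t> out(gray.size());
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            int sum = 0;
            for (int dr = -1; dr <= 1; ++dr) {
                for (int dc = -1; dc <= 1; ++dc) {
                    const int rr = std::max(0, std::min(r + dr, rows - 1));
                    const int cc = std::max(0, std::min(c + dc, cols - 1));
                    sum += gray[static_cast<std::size_t>(rr) * cols + cc];
                }
            }
            // Average of the 9 samples, rounded to nearest.
            out[static_cast<std::size_t>(r) * cols + c] = static_cast<std::uint8_t>((sum + 4) / 9);
        }
    }
    return out;
}
```

Blurring before thresholding smooths out small noise, so the contour search works on a few clean shapes instead of many tiny specks.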

Step 3. Find contours

Step 4. Measurement of the perimeter
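The perimeter measurement from the last step can be sketched as the sum of distances between consecutive contour points, with the last point connected back to the first. In OpenCV this is what cv::arcLength computes for a closed curve. Point2, closedPerimeter and perimeterInRange are hypothetical names, and filtering by a perimeter range is only one possible way to pick the card among the detected contours.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 2D point type for this sketch.
struct Point2 {
    double x;
    double y;
};

// Length of a closed contour: every point is connected to the next one,
// and the last point is connected back to the first.
double closedPerimeter(const std::vector<Point2>& contour) {
    assert(contour.size() >= 2);
    double total = 0.0;
    for (std::size_t i = 0; i < contour.size(); ++i) {
        const Point2& a = contour[i];
        const Point2& b = contour[(i + 1) % contour.size()];
        total += std::hypot(b.x - a.x, b.y - a.y);
    }
    return total;
}

// Keep only contours whose perimeter falls inside the expected range.
bool perimeterInRange(const std::vector<Point2>& contour, double minPerimeter, double maxPerimeter) {
    const double p = closedPerimeter(contour);
    return p >= minPerimeter && p <= maxPerimeter;
}
```

A 4 by 3 rectangle has a perimeter of 14, so with an expected range of 10 to 20 it would pass the filter, while much smaller noise contours would be dropped.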

Face detection



“WIKI: Cascading is a particular case of ensemble learning based on the concatenation of several Classifiers, using all information collected from the output from a given classifier as additional information for the next classifier in the cascade. Unlike voting or stacking ensembles, which are multi expert systems, cascading is a multistage one.”

The name of the technology gives us a clear understanding of how it works. Cascading classification is like a recursive search for the required feature within a single frame. For example, the human face is composed of elementary geometrical shapes, which can be described in an XML file. Every drawing has an abstract geometrical hierarchy, and painters use this technique to train their portrait skills. Basically, we find the centre of an object and draw a bounding oval. Then come the eye position, nose, lips and ears: every next node is placed on the basis of the previous one. We move from the general to the specific to find a face in a single frame.
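The multistage idea can be illustrated with a toy cascade in which each stage is a cheap test, and a candidate window is rejected as soon as one stage fails. This is only a sketch of the control flow. Real Haar cascades use trained classifiers over image features, and the Window fields and stage rules here are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical summary of a candidate image window.
struct Window {
    int meanIntensity;
    int edgeCount;
};

// A stage answers one question: can this window still be the object?
using Stage = std::function<bool(const Window&)>;

struct CascadeResult {
    bool accepted;
    std::size_t stagesEvaluated;
};

// Run the stages in order and stop at the first one that rejects the window.
CascadeResult runCascade(const std::vector<Stage>& stages, const Window& window) {
    assert(!stages.empty());
    std::size_t evaluated = 0;
    for (const Stage& stage : stages) {
        ++evaluated;
        if (!stage(window)) {
            return {false, evaluated};
        }
    }
    return {true, evaluated};
}
```

Because most windows in a frame contain no face, most of them are thrown away by the first, cheapest stages, which is what keeps the search fast.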

CascadeClassifier is a result of artificial intelligence research. By analyzing a data set and comparing against it, we get the classified object.

Face detection in OpenCV

haarcascade_frontalface_alt.xml - the trained data model that combines the features of a human face.

Through the Network


Capturing and processing are complete. Now it's time to pass the result on to FFmpeg. Let's create an output stream.

We are familiar with the cv::Mat frame. It is great for analysis but not for data transfer. It's time to use a codec, that is, compression. First, note the difference between the cv::Mat frame and AVPicture pixel formats: YUV420P for AVPicture and BGR for cv::Mat. To achieve fast output, we pack the stream with the H.264 or MPEG-4 codec.
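The pixel format difference can be shown with a small dependency-free conversion. In the pipeline described here FFmpeg's software scaler does this job. The sketch below only illustrates the idea: a full-resolution Y plane plus U and V planes subsampled by 2 in both directions. The coefficients are the common BT.601 limited-range integer approximation, which is an assumption, and Yuv420pFrame and bgrToYuv420p are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical planar frame: one luma plane and two quarter-size chroma planes.
struct Yuv420pFrame {
    int width;
    int height;
    std::vector<std::uint8_t> y; // width * height
    std::vector<std::uint8_t> u; // (width / 2) * (height / 2)
    std::vector<std::uint8_t> v; // (width / 2) * (height / 2)
};

Yuv420pFrame bgrToYuv420p(const std::vector<std::uint8_t>& bgr, int width, int height) {
    assert(width > 0 && height > 0 && width % 2 == 0 && height % 2 == 0);
    assert(bgr.size() == static_cast<std::size_t>(width) * height * 3);
    Yuv420pFrame out{width, height, {}, {}, {}};
    out.y.resize(static_cast<std::size_t>(width) * height);
    out.u.resize(out.y.size() / 4);
    out.v.resize(out.y.size() / 4);

    // Luma: one sample per pixel.
    for (std::size_t i = 0; i < out.y.size(); ++i) {
        const int b = bgr[3 * i], g = bgr[3 * i + 1], r = bgr[3 * i + 2];
        out.y[i] = static_cast<std::uint8_t>(((66 * r + 129 * g + 25 * b + 128) >> 8) + 16);
    }

    // Chroma: one sample per 2x2 block, computed from the block's average colour.
    for (int by = 0; by < height / 2; ++by) {
        for (int bx = 0; bx < width / 2; ++bx) {
            int sb = 0, sg = 0, sr = 0;
            for (int dy = 0; dy < 2; ++dy) {
                for (int dx = 0; dx < 2; ++dx) {
                    const std::size_t i =
                        (static_cast<std::size_t>(2 * by + dy) * width + (2 * bx + dx)) * 3;
                    sb += bgr[i];
                    sg += bgr[i + 1];
                    sr += bgr[i + 2];
                }
            }
            const int b = (sb + 2) / 4, g = (sg + 2) / 4, r = (sr + 2) / 4;
            const std::size_t ci = static_cast<std::size_t>(by) * (width / 2) + bx;
            // 32896 = 128 (rounding) + 128 * 256 (chroma offset), so the sum stays positive.
            out.u[ci] = static_cast<std::uint8_t>((-38 * r - 74 * g + 112 * b + 32896) >> 8);
            out.v[ci] = static_cast<std::uint8_t>((112 * r - 94 * g - 18 * b + 32896) >> 8);
        }
    }
    return out;
}
```

The result needs 12 bits per pixel instead of the 24 bits of BGR, so the frame is already half the size before the codec starts compressing it.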

INIT Stream

WIKI:  The Real Time Streaming Protocol (RTSP) is a network control protocol designed for use in entertainment and communications systems to control streaming media servers. The protocol is used for establishing and controlling media sessions between end points. Clients of media servers issue VCR-style commands, such as play, record and pause, to facilitate real-time control of the media streaming from the server to a client (Video On Demand) or from a client to the server (Voice Recording).

Initialize the output stream. Define a codec. Set up the buffers. Set up the headers.

Configure a codec

Connection config. Compression level, quality of the stream
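To get a feeling for how much work the compression settings have to do, the sketch below compares the raw YUV420P data rate with a target bitrate. The function names are hypothetical, and the 2 Mbit/s target in the note after the code is only an example value, not a recommendation from this article.

```cpp
#include <cassert>
#include <cstdint>

// Raw data rate of an uncompressed YUV420P stream in bits per second.
// YUV420P needs 12 bits per pixel: 8 for Y plus 2 + 2 for subsampled U and V.
std::int64_t rawYuv420pBitsPerSecond(int width, int height, int fps) {
    assert(width > 0 && height > 0 && fps > 0);
    return static_cast<std::int64_t>(width) * height * 12 * fps;
}

// How many times smaller the encoded stream must be to fit the target bitrate.
double compressionRatio(int width, int height, int fps, std::int64_t targetBitsPerSecond) {
    assert(targetBitsPerSecond > 0);
    return static_cast<double>(rawYuv420pBitsPerSecond(width, height, fps)) /
           static_cast<double>(targetBitsPerSecond);
}
```

A raw 1280x720 stream at 30 frames per second is about 332 Mbit/s, so fitting it into 2 Mbit/s means the codec has to shrink the data roughly 166 times. This is the trade-off between compression level and stream quality.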

Prepare the cv::Mat frame for writing. Use a software scaling context (SwsContext) to create an AVPicture from the cv::Mat.

Write each single frame.


FFmpeg and OpenCV are a real force in media content manipulation. These low-level C/C++ tools allow us to analyse video in real time. It actually looks like magic, with all the details hidden within.

Hope you have found this article useful. Don't forget to leave your comments below the article. If you have a project idea in mind but don't know where to start, we're always here to help you.