Become a Video Analysis Expert: A Simple Approach to Automatically Generating Highlights using Python

An audio signal can be analyzed in the time or frequency domain.

In the time domain, an audio signal is analyzed with respect to time, whereas in the frequency domain it is analyzed with respect to frequency.

The energy or power of an audio signal relates to the loudness of the sound.

It is computed as the sum of the squared amplitudes of the audio signal in the time domain.

When energy is computed for a short chunk of the audio signal rather than for the entire signal, it is known as Short Time Energy.
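Written out, the two definitions look as follows. The symbols are notation introduced here for illustration, not taken from the original: x[n] is the n-th amplitude sample, N is the number of samples in the whole signal, and each chunk holds L samples, so E is the energy of the full signal and E_k is the Short Time Energy of chunk k.

```latex
E = \sum_{n=0}^{N-1} x[n]^2
\qquad
E_k = \sum_{n=kL}^{(k+1)L-1} x[n]^2
```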

Source: facto-facts.com

The basic idea behind the solution is that in most sports, whenever an interesting event occurs, the commentator’s voice rises and so does the noise from the spectators.

Let’s take cricket for example.

Whenever a batsman hits a boundary or a bowler takes a wicket, there is a rise in the commentator’s voice.

The ground swells with the sound of the spectators cheering.

We can use these changes in audio to capture interesting moments from a video.

Here is the step-by-step process:

1. Input the full match video
2. Extract the audio
3. Break the audio into chunks
4. Compute short-time energy of every chunk
5. Classify every chunk as excitement or not (based on a threshold value)
6. Merge all the excitement-clips to form the video highlights

Understanding the Problem Statement

Cricket is the most famous sport in India and played in almost all parts of the country.

So, being a die-hard cricket fan, I decided to automate the process of highlights extraction from a full match cricket video.

Nevertheless, the same idea can be applied to other sports as well.

For this article, I have considered only the first 6 overs (PowerPlay) of the semi-final match between India and Australia at the T20 World Cup in 2007.

You can watch the full match on YouTube here and download the video for the first six overs from here.

Automatic Highlight Generation in Python

I have extracted the audio from the video using a program called WavePad Audio Editor.

You can download the audio clip from here.

View the code on Gist.
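The Gist is not reproduced in this text. As a minimal sketch of the loading step, assuming the extracted audio was saved as a 16-bit PCM WAV file, the standard-library `wave` module and NumPy are enough. The function name `load_wav_mono` is hypothetical and this is not necessarily how the original code loads the clip.

```python
import wave

import numpy as np


def load_wav_mono(path):
    """Read a 16-bit PCM WAV file and return (samples, sample_rate).

    Samples are floats in [-1.0, 1.0); multi-channel audio is averaged
    down to a single channel.
    """
    with wave.open(path, "rb") as wav_file:
        if wav_file.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM audio")
        sample_rate = wav_file.getframerate()
        n_channels = wav_file.getnchannels()
        raw = wav_file.readframes(wav_file.getnframes())

    # Interpret the bytes as little-endian 16-bit integers, then rescale.
    samples = np.frombuffer(raw, dtype="<i2").astype(np.float64) / 32768.0
    if n_channels > 1:
        samples = samples.reshape(-1, n_channels).mean(axis=1)
    return samples, sample_rate
```

It would be called as `x, sr = load_wav_mono("powerplay.wav")`, where the file name is a placeholder. Energy values depend on how the loader scales the samples and on the sample rate, so a threshold tuned with one loader does not carry over to another unchanged.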

We can get the duration of the audio clip in minutes using the code below: View the code on Gist.
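As a rough sketch of that calculation (the helper name is hypothetical): the duration follows directly from the number of samples and the sample rate, so no audio library is needed for this step.

```python
def duration_minutes(n_samples, sample_rate):
    """Length of a clip in minutes, given its sample count and sample rate."""
    return n_samples / sample_rate / 60.0


# Made-up numbers: 30 minutes of audio sampled at 16 kHz.
print(duration_minutes(30 * 60 * 16_000, 16_000))  # → 30.0
```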

Now, we will break the audio into chunks of 5 seconds each, since we are interested in finding out whether a particular chunk contains a rise in loudness: View the code on Gist.
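One way to sketch the chunking, assuming the audio is already a 1-D NumPy array `samples` with a known `sample_rate` (both names, and the function name, are illustrative): slice the array in steps of 5 × sample rate.

```python
import numpy as np


def split_into_chunks(samples, sample_rate, chunk_seconds=5):
    """Split a 1-D signal into consecutive chunks of `chunk_seconds` each.

    The last chunk may be shorter when the clip length is not an exact
    multiple of the chunk length.
    """
    chunk_len = int(chunk_seconds * sample_rate)
    return [samples[i:i + chunk_len] for i in range(0, len(samples), chunk_len)]
```

Chunk `k` covers the time span from `k * chunk_seconds` to `(k + 1) * chunk_seconds` seconds, which is what later lets an excited chunk be mapped back to a position in the video.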

Let us listen to one of the audio chunks: View the code on Gist.

Compute the energy for the chunk: View the code on Gist.
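Following the definition given earlier (the sum of squared amplitudes), the energy of a single chunk can be sketched as below. The function name is hypothetical.

```python
import numpy as np


def chunk_energy(chunk):
    """Sum of squared amplitudes of one chunk of audio samples."""
    chunk = np.asarray(chunk, dtype=np.float64)
    return float(np.sum(chunk ** 2))


# Tiny made-up chunk: 0.25 + 0.25 + 1.0
print(chunk_energy([0.5, -0.5, 1.0]))  # → 1.5
```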

Visualize the chunk in the time-series domain: View the code on Gist.

As we can see, the amplitude of the signal varies with time.

Next, compute the Short Time Energy for every chunk: View the code on Gist.
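A minimal sketch of this step, again assuming a 1-D array of samples and 5-second chunks (names are illustrative): compute the sum of squares over each consecutive window and collect the results in one array, with one value per chunk.

```python
import numpy as np


def short_time_energy(samples, sample_rate, chunk_seconds=5):
    """Return an array holding the energy of each consecutive chunk."""
    samples = np.asarray(samples, dtype=np.float64)
    chunk_len = int(chunk_seconds * sample_rate)
    return np.array([
        np.sum(samples[i:i + chunk_len] ** 2)
        for i in range(0, len(samples), chunk_len)
    ])
```

Because every full chunk has the same number of samples, the energies are directly comparable; only a shorter final chunk is at a disadvantage.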

Let us understand the Short Time Energy distribution of the chunks: View the code on Gist.

The energy distribution is right-skewed as we can see in the above plot.

We will choose a value in the extreme tail as the threshold, since we are interested only in the clips where the commentator’s speech and the spectators’ cheers are loud.

Here, I am considering the threshold to be 12,000 as it lies in the tail of the distribution.

Feel free to experiment with different values and see what result you get.

View the code on Gist.
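The thresholding step can be sketched as follows. The function name is hypothetical, and treating a chunk whose energy is exactly equal to the threshold as exciting is an assumption, since the text does not say which way ties go.

```python
def find_excitement_clips(energies, threshold, chunk_seconds=5):
    """Return (start, end) times in seconds of chunks at or above threshold."""
    return [
        (i * chunk_seconds, (i + 1) * chunk_seconds)
        for i, energy in enumerate(energies)
        if energy >= threshold
    ]


# Made-up energies for five chunks, using the 12,000 threshold from the text.
print(find_excitement_clips([100, 13000, 12500, 50, 12000], threshold=12000))
# → [(5, 10), (10, 15), (20, 25)]
```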

Merge consecutive time intervals of audio clips into one: View the code on Gist.
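A possible sketch of the merge, assuming the excitement clips are (start, end) pairs in seconds sorted by start time (the function name is illustrative): walk through the list once and extend the previous interval whenever the next one begins where it ends or overlaps it.

```python
def merge_consecutive(intervals):
    """Merge (start, end) intervals that touch or overlap.

    `intervals` must be sorted by start time.
    """
    merged = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


print(merge_consecutive([(5, 10), (10, 15), (20, 25)]))
# → [(5, 15), (20, 25)]
```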

Extract the video within a particular time interval to form highlights.

 Remember – Since the commentator’s speech and spectators’ cheers increase only after the batsman has played a shot, I am considering only five seconds post every excitement clip: View the code on Gist.
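As one hedged way to do the cutting, the sketch below widens each merged interval and builds an `ffmpeg` command for it. It assumes the `ffmpeg` command-line tool is installed. Both function names are hypothetical, and the `pad_before` parameter is an extra knob not described in the text.

```python
def clip_window(start, end, pad_before=0, pad_after=5):
    """Widen an excitement interval (in seconds) by optional padding.

    The defaults follow the text (five seconds after each clip);
    `pad_before` is a hypothetical option for keeping lead-up footage.
    """
    return max(0, start - pad_before), end + pad_after


def ffmpeg_cut_command(video_path, start, end, out_path):
    """Build an ffmpeg command that copies [start, end] seconds to out_path."""
    return [
        "ffmpeg", "-i", video_path,
        "-ss", str(start), "-to", str(end),
        "-c", "copy", out_path,
    ]
```

Each command could then be run with `subprocess.run(cmd, check=True)`. With `-c copy` the streams are not re-encoded, so cuts snap to keyframes and clip boundaries may be off by a short amount.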

I have used online editors to merge all the extracted clips to form a single video.

Here are the highlights generated from the PowerPlay using a simple speech analysis approach:

Congratulations on making it this far and generating your own highlight package! Go ahead and apply this technique to any match or sport you want.

It might appear straightforward but it’s such a powerful approach.

End Notes

The key takeaway from this article: have a thorough understanding of the domain as well as the data before getting into the model-building process, since that leads to a better solution for most problems.

In this article, we have seen how to automate the process of highlight extraction from a full match sports video using simple speech analysis.

I recommend experimenting with other sports too.

Liked the article? Want to share a different approach? Feel free to connect with me in the comments section below!