Music Genre Classification with Python

We shall then use the skills learnt to classify music clips into different genres.

Audio Processing in Python

Sound is represented in the form of an audio signal having parameters such as frequency, bandwidth, decibel, etc. A typical audio signal can be expressed as a function of amplitude and time.

These sounds are available in many formats, which makes it possible for the computer to read and analyse them. Some examples are:

- mp3 format
- WMA (Windows Media Audio) format
- wav (Waveform Audio File) format

Audio Libraries

Python has some great libraries for audio processing, like Librosa and PyAudio. There are also built-in modules for some basic audio functionalities.

We will mainly use two libraries for audio acquisition and playback:

1. Librosa

It is a Python module to analyze audio signals in general, but geared more towards music. It includes the nuts and bolts to build a MIR (music information retrieval) system. It is very well documented, with many examples and tutorials.

For a more advanced introduction which describes the package design principles, please refer to the librosa paper at SciPy 2015.

Installation

    pip install librosa

or

    conda install -c conda-forge librosa

To fuel more audio-decoding power, you can install ffmpeg, which ships with many audio decoders.

2. IPython.display.Audio

IPython.display.Audio lets you play audio directly in a Jupyter notebook.

Loading an audio file

    import librosa
    audio_path = '../T08-violin.wav'
    x, sr = librosa.load(audio_path)
    print(type(x), type(sr))

    <class 'numpy.ndarray'> <class 'int'>

    print(x.shape, sr)

    (396688,) 22050

This returns an audio time series as a numpy array, mixed down to mono, with a default sampling rate (sr) of 22050 Hz (about 22 kHz). We can change this behaviour by saying:

    librosa.load(audio_path, sr=44100)

to resample at 44.1 kHz, or

    librosa.load(audio_path, sr=None)

to disable resampling.

The sample rate is the number of samples of audio carried per second, measured in Hz or kHz.

Playing Audio

Using IPython.display.Audio to play the audio:

    import IPython.display as ipd
    ipd.Audio(audio_path)

This returns an audio widget in the Jupyter notebook as follows:

[Screenshot of the IPython audio widget]

This widget won’t work here, but it will work in your notebooks. I have uploaded the same to SoundCloud so that we can listen to it.

You can even use an mp3 or a WMA format for the audio example.

Visualizing Audio

Waveform

We can plot the audio array using librosa.display.waveplot:

    %matplotlib inline
    import matplotlib.pyplot as plt
    import librosa.display
    plt.figure(figsize=(14, 5))
    # In librosa 0.10 and later, waveplot has been removed; use librosa.display.waveshow instead.
    librosa.display.waveplot(x, sr=sr)

Here, we have the plot of the amplitude envelope of a waveform.

Spectrogram

A spectrogram is a visual representation of the spectrum of frequencies of sound or other signals as they vary with time. Spectrograms are sometimes called sonographs, voiceprints, or voicegrams. When the data is represented in a 3D plot, they may be called waterfalls. In 2-dimensional arrays, the first axis is frequency while the second axis is time.

We can display a spectrogram using ...
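
To make the axis convention described above concrete, here is a minimal sketch of a magnitude spectrogram built from a short-time Fourier transform, written with only the Python standard library so it runs without extra installs. It is an illustration under stated assumptions, not the code this tutorial or librosa uses: the helper names make_tone and spectrogram, the 8 kHz sample rate, the 1 kHz test tone, and the frame size and hop length are all hypothetical choices.

```python
import cmath
import math

N_FFT = 64   # samples per analysis frame (hypothetical choice)
HOP = 32     # samples between the starts of consecutive frames (hypothetical choice)


def make_tone(freq_hz, sr, n_samples):
    """Return a pure sine tone as a list of floats (a stand-in for a loaded clip)."""
    return [math.sin(2 * math.pi * freq_hz * n / sr) for n in range(n_samples)]


def spectrogram(x, n_fft=N_FFT, hop=HOP):
    """Magnitude spectrogram as nested lists, indexed as spec[freq_bin][frame]."""
    # A Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / n_fft) for n in range(n_fft)]
    n_bins = n_fft // 2 + 1  # keep the non-negative frequencies only
    # Precomputed DFT factors exp(-2j*pi*m/n_fft), looked up by (k * n) mod n_fft.
    twiddle = [cmath.exp(-2j * math.pi * m / n_fft) for m in range(n_fft)]
    starts = range(0, len(x) - n_fft + 1, hop)
    spec = [[0.0] * len(starts) for _ in range(n_bins)]
    for t, start in enumerate(starts):
        frame = [x[start + n] * window[n] for n in range(n_fft)]
        for k in range(n_bins):
            acc = sum(frame[n] * twiddle[(k * n) % n_fft] for n in range(n_fft))
            spec[k][t] = abs(acc)
    return spec


sr = 8000                     # hypothetical sample rate in Hz
x = make_tone(1000, sr, 512)  # 1 kHz tone, 512 samples long
spec = spectrogram(x)

n_bins, n_frames = len(spec), len(spec[0])
# Strongest frequency bin in the first frame, converted back to Hz.
peak_bin = max(range(n_bins), key=lambda k: spec[k][0])
peak_hz = peak_bin * sr / N_FFT
print(n_bins, n_frames, peak_hz)  # prints: 33 15 1000.0
```

The first index of spec selects a frequency bin and the second selects a time frame, matching the 2-dimensional layout mentioned above. A bin index maps back to Hz as bin * sr / n_fft, and frame t starts at t * hop / sr seconds. In a notebook you would normally let an audio library compute and plot this rather than use a hand-written transform.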
