Unsupervised Music Speech Classification using Gaussian Mixture Models (EM algorithm)

sm823zw/Music-Speech-Classification

Music-Speech-Classification

Unsupervised Music Speech Classification using Gaussian Mixture Models (Expectation Maximization algorithm)

Problem Description -

A set of training and test examples of music and speech is provided: www.leap.ee.iisc.ac.in/sriram/teaching/MLSP21/assignments/speechMusicData.tar.gz

Using these examples:

a) Generate spectrogram features. Use the log magnitude spectrogram with a 64-point magnitude FFT (NFFT = 64), a 25 ms window, and a 10 ms shift. The spectrogram then has dimension 32 × the number of frames.

b) Train two GMMs, one per class, each with K-means initialization and 5 mixture components, using diagonal and full covariance in turn. Plot the log-likelihood as a function of the EM iteration.

c) Classify the test samples with the trained models and report the performance as the error rate (percentage of misclassified samples) on the test data.
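As a rough illustration of step a), the sketch below computes a 32 × frames log magnitude spectrogram with NumPy. It is not taken from the repository. The function name `log_spectrogram`, the 16 kHz sample rate, the Hamming window, and the small constant added before the log are all assumptions. The problem statement also does not say how a 25 ms frame is reconciled with a 64-point FFT; this sketch simply crops each frame to its first 64 samples, which is one possible reading.

```python
import numpy as np

def log_spectrogram(signal, sample_rate=16000, win_ms=25, hop_ms=10, nfft=64):
    """Return a (nfft // 2, n_frames) log magnitude spectrogram."""
    signal = np.asarray(signal, dtype=float)
    win_len = int(sample_rate * win_ms / 1000)   # 400 samples at the assumed 16 kHz
    hop_len = int(sample_rate * hop_ms / 1000)   # 160 samples at the assumed 16 kHz
    n_frames = 1 + (len(signal) - win_len) // hop_len

    # Index matrix that slices the signal into overlapping frames, one per row.
    starts = hop_len * np.arange(n_frames)
    idx = starts[:, None] + np.arange(win_len)[None, :]
    frames = signal[idx]                         # (n_frames, win_len)

    # Assumption: keep only the first nfft samples of each frame, then window them.
    seg = frames[:, :nfft] * np.hamming(min(win_len, nfft))

    # rfft returns nfft // 2 + 1 bins; keep the first nfft // 2 to get 32 dimensions.
    mag = np.abs(np.fft.rfft(seg, n=nfft, axis=1))
    return np.log(mag[:, : nfft // 2] + 1e-10).T
```

Calling `log_spectrogram(x)` on one second of 16 kHz audio gives an array of shape `(32, 98)`; transposing it yields one 32-dimensional feature vector per frame for GMM training.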

The EM algorithm used to train the GMMs is implemented from scratch using only NumPy, without any sklearn library functions.
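For readers who want to see what such a from-scratch implementation can look like, here is a minimal sketch of EM for a diagonal-covariance GMM with K-means initialization. It is not the repository's code. The function names (`kmeans_init`, `log_gauss_diag`, `fit_gmm_diag`), the variance floor, and the iteration counts are assumptions, and the full-covariance case is omitted for brevity.

```python
import numpy as np

def kmeans_init(X, k, n_iter=10, seed=0):
    """Run a few Lloyd iterations and return k cluster means."""
    rng = np.random.default_rng(seed)
    means = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):              # keep the old mean if a cluster empties
                means[j] = X[labels == j].mean(axis=0)
    return means

def log_gauss_diag(X, means, variances):
    """Log density of every frame under every component, shape (n, k)."""
    d = X.shape[1]
    diff = X[:, None, :] - means[None, :, :]
    return -0.5 * (d * np.log(2 * np.pi)
                   + np.log(variances).sum(axis=1)[None, :]
                   + (diff ** 2 / variances[None, :, :]).sum(axis=2))

def fit_gmm_diag(X, k=5, n_iter=50, var_floor=1e-6, seed=0):
    """Fit a diagonal GMM with EM; also return the log-likelihood per iteration."""
    means = kmeans_init(X, k, seed=seed)
    weights = np.full(k, 1.0 / k)
    variances = np.tile(X.var(axis=0) + var_floor, (k, 1))
    history = []
    for _ in range(n_iter):
        # E-step: responsibilities, computed in the log domain for stability.
        log_joint = np.log(weights)[None, :] + log_gauss_diag(X, means, variances)
        log_norm = np.logaddexp.reduce(log_joint, axis=1)
        history.append(log_norm.sum())           # log-likelihood of the current parameters
        resp = np.exp(log_joint - log_norm[:, None])

        # M-step: re-estimate weights, means, and per-dimension variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        second_moment = (resp.T @ X ** 2) / nk[:, None]
        variances = np.maximum(second_moment - means ** 2, var_floor)
    return weights, means, variances, np.array(history)
```

The returned `history` array holds the log-likelihood at each EM iteration, which is the quantity the problem statement asks to plot. It should be non-decreasing as long as the variance floor is not reached.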

Performance analysis -

| Number of GMM Components | Type of Covariance | Accuracy (in %) |
| --- | --- | --- |
| 2 | full | 83.33 |
| 2 | diagonal | 70.83 |
| 5 | full | 97.92 |
| 5 | diagonal | 64.58 |
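The table reports accuracy, while the problem statement asks for the error rate; the two are related by error rate = 100 − accuracy, so 97.92% accuracy corresponds to a 2.08% error rate. The sketch below shows one way a test file could be classified and the error rate computed. It is an assumption, not the repository's code: it assigns each file to the class whose GMM gives the higher total log-likelihood over all frames, uses diagonal covariances only, and the names `gmm_loglik_diag`, `classify`, and `error_rate` are hypothetical.

```python
import numpy as np

def gmm_loglik_diag(X, weights, means, variances):
    """Total log-likelihood of frames X (n, d) under a diagonal-covariance GMM."""
    d = X.shape[1]
    diff = X[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (d * np.log(2 * np.pi)
                        + np.log(variances).sum(axis=1)[None, :]
                        + (diff ** 2 / variances[None, :, :]).sum(axis=2))
    log_joint = np.log(weights)[None, :] + log_gauss
    return np.logaddexp.reduce(log_joint, axis=1).sum()

def classify(X, models):
    """Pick the label whose GMM scores the frames highest.

    models maps a label to a (weights, means, variances) tuple.
    """
    scores = {label: gmm_loglik_diag(X, *params) for label, params in models.items()}
    return max(scores, key=scores.get)

def error_rate(predicted, actual):
    """Percentage of misclassified samples."""
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return 100.0 * wrong / len(actual)
```

Here `X` is the feature matrix of one test file with one frame per row, so a 32 × frames spectrogram would be transposed before being passed in.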
