Baseline Model: Classical Classification Models
- Genre Oracle
- Nov 2, 2018
Besides the Bayes model, we tackle the classification problem with classical classification models: SVM, Logistic Regression, and KNN.
We sample the 20340 songs evenly from 14 genres, so that every genre contributes the same number of samples.
We use all 4873 words that appear in the songs as features, ordered by descending frequency, and the 20340 songs as samples.
Here is a sample of X and y.
X: (20340 × 4873)

y: (20340 × 1)
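To make the shapes of X and y concrete, here is a minimal sketch of how a song-by-word matrix could be assembled. It assumes each entry of X is a raw word count and that lyrics are split on whitespace, neither of which is confirmed above. The toy lyrics, the genre labels, and the helper names `build_vocabulary` and `build_matrix` are hypothetical.

```python
from collections import Counter


def build_vocabulary(lyrics):
    """Return every distinct word, most frequent first (ties broken alphabetically)."""
    counts = Counter(word for text in lyrics for word in text.lower().split())
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked]


def build_matrix(lyrics, vocabulary):
    """Build one row per song and one column per vocabulary word (raw counts)."""
    column = {word: i for i, word in enumerate(vocabulary)}
    rows = []
    for text in lyrics:
        row = [0] * len(vocabulary)
        for word in text.lower().split():
            if word in column:  # ignore words outside the vocabulary
                row[column[word]] += 1
        rows.append(row)
    return rows


# Hypothetical toy data standing in for the real songs and genres.
lyrics = ["love me love you", "ride the highway", "love the highway"]
genres = ["pop", "country", "country"]

vocabulary = build_vocabulary(lyrics)
X = build_matrix(lyrics, vocabulary)
y = genres

print(vocabulary)
print(f"X: ({len(X)} x {len(X[0])}), y: ({len(y)} x 1)")
```

With the real data, the same construction would give one row for each of the 20340 songs and one column for each of the 4873 words.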

Here is the accuracy table:
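For readers who want to build a comparable table themselves, the sketch below trains the three model families with scikit-learn and reports test accuracy for each. It is only an illustration: the data is synthetic (generated with `make_classification`), and the train/test split and all hyperparameters are assumptions rather than the settings used for the results in this post.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import LinearSVC

# Synthetic stand-in for the real song-by-word matrix (assumption).
X, y = make_classification(
    n_samples=600, n_features=40, n_informative=10, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hyperparameters here are illustrative defaults, not tuned values.
models = {
    "LinearSVC": LinearSVC(max_iter=5000),
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}

accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracy[name] = model.score(X_test, y_test)  # mean accuracy on the test split

# Print the models from best to worst accuracy.
for name, acc in sorted(accuracy.items(), key=lambda item: -item[1]):
    print(f"{name:<20}{acc:.3f}")
```

The numbers printed by this sketch describe the synthetic data only and say nothing about the accuracies obtained on the songs.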

We chose LinearSVC, the best-performing model, as the base of our improved model. Here is its confusion matrix:
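As a reminder of how such a matrix is read, here is a small, self-contained sketch that counts true versus predicted genres. The three genre names and both label lists are made up for illustration, and the convention of rows as true labels and columns as predicted labels is an assumption about the layout.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count predictions: rows are true genres, columns are predicted genres."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true, pred in zip(y_true, y_pred):
        matrix[index[true]][index[pred]] += 1
    return matrix


# Hypothetical labels and predictions for six songs.
labels = ["country", "pop", "rock"]
y_true = ["pop", "pop", "rock", "country", "rock", "country"]
y_pred = ["pop", "rock", "rock", "country", "pop", "country"]

cm = confusion_matrix(y_true, y_pred, labels)
for label, row in zip(labels, cm):
    print(f"{label:<10}{row}")

# Correct predictions lie on the diagonal.
correct = sum(cm[i][i] for i in range(len(labels)))
print(f"accuracy: {correct / len(y_true):.3f}")
```

Off-diagonal cells show which genres get confused with each other; in this toy example pop and rock are mixed up once in each direction. In practice, `sklearn.metrics.confusion_matrix` computes the same counts.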
