Conclusions, Limits, and Future Work
- Genre Oracle
- Nov 30, 2018
Conclusions:
In summary, we found that genre, an intrinsically subjective and supposedly unbounded concept, has an underlying structure that allows for quantitative analysis. Indeed, our project was a success: we were able to predict a song’s genre from its lyrical content with 96% accuracy.
We also found that music has changed substantially over the years. Over the span of several decades, the lyrics of many genres decreased in complexity (measured by word length) and increased in their use of profanity.
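To make the two measurements concrete, here is a minimal sketch of how mean word length and profanity rate could be computed for a single song. It is not the code used in the project: the function name `lyric_stats` and the tiny `PROFANITY` set are hypothetical placeholders, and a real analysis would need a curated word list.

```python
import re

# Hypothetical placeholder list; a real analysis would use a curated lexicon.
PROFANITY = {"damn", "hell"}


def lyric_stats(lyrics: str) -> dict:
    """Return the mean word length and the share of words found in PROFANITY."""
    # Lowercase the text and keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", lyrics.lower())
    if not words:
        return {"mean_word_length": 0.0, "profanity_rate": 0.0}
    mean_length = sum(len(word) for word in words) / len(words)
    profane = sum(1 for word in words if word in PROFANITY)
    return {
        "mean_word_length": mean_length,
        "profanity_rate": profane / len(words),
    }


stats = lyric_stats("Damn the rain, I walk alone")
print(stats)
```

Averaging these per-song values by genre and release year would give the kind of trend described above.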
Limits:
We had many ideas for this project that time constraints kept us from pursuing.
First, our data set only covers songs up to 2010, so our findings may not hold for more recent music.
Second, because our classifier already performed well, we did not use NLP techniques to analyze the meaning of the lyrics. As a next step, we could extract the meaning and emotion of the lyrics with the NLTK package and add them as new features to our SVC model.
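As a rough illustration of adding an emotion feature, the sketch below appends one extra column to a TF-IDF matrix before fitting an SVC. Everything here is an assumption for illustration: the toy lyrics, the `POSITIVE` and `NEGATIVE` word sets, and the `emotion_score` helper are hypothetical, and the original feature set of our model is not reproduced. With NLTK installed and its `vader_lexicon` resource downloaded, `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]` could take the place of `emotion_score`.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

# Hypothetical toy lexicons standing in for a real sentiment model.
POSITIVE = {"love", "sunshine", "dance"}
NEGATIVE = {"cry", "pain", "alone"}


def emotion_score(text: str) -> float:
    """Return (positive words - negative words) / total words for one lyric."""
    words = text.lower().split()
    if not words:
        return 0.0
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    return (positive - negative) / len(words)


# Made-up example data, not taken from the project's data set.
lyrics = [
    "love and sunshine on the dance floor",
    "dance all night with my love",
    "alone with whiskey and pain",
    "cry in my truck alone",
]
genres = ["pop", "pop", "country", "country"]

# Word features plus one extra column holding the emotion score.
vectorizer = TfidfVectorizer()
X_words = vectorizer.fit_transform(lyrics)
X_emotion = csr_matrix(np.array([[emotion_score(text)] for text in lyrics]))
X = hstack([X_words, X_emotion]).tocsr()

model = SVC(kernel="linear")
model.fit(X, genres)
predicted = model.predict(X)
```

Whether the extra column helps would have to be checked on held-out data, since the word features may already capture most of the signal.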
Future Work:
For future work, besides continuing to improve the SVC model, we also want to try other models on this classification problem.
First, we want to consider other effective models (including decision trees) and use a random forest to give a more reliable final classification.
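A minimal sketch of that comparison, assuming scikit-learn and TF-IDF word features, might look like the following. The lyrics and genre labels are made up for illustration, and the pipeline is an assumption rather than the setup used in this project.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Made-up example data, not taken from the project's data set.
lyrics = [
    "dusty road and an old truck",
    "whiskey on the porch at sundown",
    "my truck broke down by the river",
    "boots and whiskey in the rain",
    "dance floor lights all night",
    "baby we dance until the morning",
    "neon lights and a party tonight",
    "turn it up and dance with me",
]
genres = ["country"] * 4 + ["pop"] * 4

# One pipeline per candidate model, so each gets its own fitted vectorizer.
models = {
    "decision_tree": make_pipeline(
        TfidfVectorizer(), DecisionTreeClassifier(random_state=0)
    ),
    "random_forest": make_pipeline(
        TfidfVectorizer(),
        RandomForestClassifier(n_estimators=100, random_state=0),
    ),
}

predictions = {}
for name, model in models.items():
    model.fit(lyrics, genres)
    predictions[name] = model.predict(["whiskey and a dusty truck"])[0]
```

A random forest averages the votes of many decision trees trained on resampled data, which is why it tends to be more stable than a single tree. On the real data set, the models should be compared with cross-validation rather than a single prediction.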
Second, the labels in our data set were entered manually by customers, so they may reflect personal judgment. We want to build an unsupervised model, such as k-means, to see whether the resulting clusters agree with the customers' labels.
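One way to run that check, sketched here under the assumption of scikit-learn and TF-IDF features, is to cluster the lyrics with k-means and score the agreement with the existing labels using the adjusted Rand index. The sample lyrics and the `user_labels` list are made up for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import adjusted_rand_score

# Made-up example data, not taken from the project's data set.
lyrics = [
    "dusty road and an old truck",
    "whiskey on the porch at sundown",
    "my truck broke down on a dusty road",
    "dance floor lights all night",
    "baby we dance until the morning",
    "neon lights on the dance floor",
]
user_labels = ["country", "country", "country", "pop", "pop", "pop"]

X = TfidfVectorizer().fit_transform(lyrics)

# Use as many clusters as there are distinct labels, so the two are comparable.
kmeans = KMeans(n_clusters=len(set(user_labels)), n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

# The adjusted Rand index is 1.0 for identical groupings and near 0 for random ones.
agreement = adjusted_rand_score(user_labels, cluster_ids)
print(f"Adjusted Rand index: {agreement:.2f}")
```

The cluster numbers themselves are arbitrary, so the score compares which songs are grouped together rather than the label names.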