TY - GEN
T1 - A bag-of-tones model with MFCC features for musical genre classification
AU - Qin, Zengchang
AU - Liu, Wei
AU - Wan, Tao
PY - 2013
Y1 - 2013
N2 - Musical genres are categorical labels created by humans to characterize pieces of music. These labels may be highly subjective but are typically related to the instrumentation, rhythmic structure, and harmonic content of the music. In this paper, we propose a model for music genre classification. The new model, referred to as the bag-of-tones (BOT) model, follows the same idea as the bag-of-words (BOW) model in natural language processing and the bag-of-features (BOF) model in image processing. Basic low-level music features such as Mel-frequency cepstral coefficients (MFCC) are clustered into a set of codewords referred to as "tones". Using this model, each piece of music can be represented by a new feature vector: its distribution over tones. Classical machine learning models such as support vector machines (SVM) can then be applied for genre classification. The model is tested on two datasets. We found that the polynomial kernel function gives the best performance in SVM classification. Compared with previous work, the proposed model outperforms classical models on a given benchmark dataset. In general, this model can be used to structure the large collections of music available on the Web. It can play an important role in automatic digital music categorization and retrieval.
KW - Bag-of-tones
KW - Bag-of-words
KW - MFCC
KW - Musical genre classification
UR - https://www.scopus.com/pages/publications/84893027328
U2 - 10.1007/978-3-642-53914-5_48
DO - 10.1007/978-3-642-53914-5_48
M3 - Conference contribution
AN - SCOPUS:84893027328
SN - 9783642539138
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 564
EP - 575
BT - Advanced Data Mining and Applications - 9th International Conference, ADMA 2013, Proceedings
T2 - 9th International Conference on Advanced Data Mining and Applications, ADMA 2013
Y2 - 14 December 2013 through 16 December 2013
ER -