dc.contributor.advisor Hang, Xiyi en_US
dc.contributor.author Davis, Rosalind
dc.date.accessioned 2018-02-08T18:14:14Z
dc.date.available 2018-02-08T18:14:14Z
dc.date.copyright 2018 en_US
dc.date.issued 2018-02-08
dc.identifier.uri en
dc.description Includes bibliographical references (pages 82-88) en_US
dc.description.abstract Since 2015, the music industry has experienced a resurgence driven by online music sales and streaming, which has in turn been facilitated by very large archives of musical data. These large musical archives, however, remain challenging to search and index effectively, due to the scale of the data involved and the subjective, perceptual nature of how humans relate to music. Contemporary research in music information retrieval seeks to bridge this gap by using algorithmic analysis on features extracted from the underlying audio to automatically classify and identify perceptual features in music. This project applied three machine learning techniques (support vector classification, traditional neural networks, and convolutional neural networks) to two sets of audio features (Mel-frequency cepstral coefficients and the discrete wavelet transform) for the purposes of genre classification. Because convolutional neural networks have been used on images to great effect, the discrete wavelet transform data was used to map audio into the image domain, to leverage publicly available, pre-trained weight sets for four large, sophisticated image recognition networks. For all tasks, two subsets of a large, publicly available musical dataset were used, along with multiple training and optimization techniques. While all models were able to meet or exceed some pre-existing benchmarks for the genre classification task, support vector classification was found to yield better results, with a best overall test set accuracy of 61%, than either traditional neural networks (51.4%) or convolutional neural networks (40.5%) on an eight-genre multi-class classification task. The application of the pre-trained image recognition networks to audio wavelet data decreased training time, but was not found to yield accuracies comparable to the accuracies those networks achieved on image data. 
The small size of the dataset relative to datasets in other domains, the reuse of data augmentation techniques intended for use on images, and sub-optimal feature extraction techniques are suggested as factors in the inability of the machine-learning models evaluated in this project to achieve the quality of results observed in the image domain. Audio-native augmentation techniques and the use of ensemble models present worthwhile avenues for future investigation.
dc.description.statementofresponsibility by Rosalind M. Davis en_US
dc.format.extent xii, 99 pages en_US
dc.language.iso en_US en_US
dc.publisher California State University, Northridge en_US
dc.rights.uri en_US
dc.subject wavelet transform
dc.subject neural networks
dc.subject genre classification
dc.subject Mel-frequency cepstral coefficients
dc.subject discrete wavelet transform
dc.subject music information retrieval
dc.subject music classification
dc.subject support vector classification
dc.subject convolutional neural networks
dc.subject machine learning
dc.subject.other Dissertations, Academic -- CSUN -- Engineering -- Electrical and Computer Engineering. en_US
dc.title Feature extraction and machine learning techniques for musical genre determination
dc.type Thesis en_US 2018-02-08T18:14:14Z
dc.contributor.department California State University, Northridge. Department of Elec & Comp Engr en_US M.S. en_US
dc.contributor.committeeMember Van Alphen, Deborah K en_US
dc.contributor.committeeMember Flynn, James A en_US
dc.rights.license By signing and submitting this license, you the author grant permission to CSUN Graduate Studies to submit your thesis or dissertation, and any additional associated files you provide, to CSUN ScholarWorks, the institutional repository of the California State University, Northridge, on your behalf. You grant to CSUN ScholarWorks the non-exclusive right to reproduce and/or distribute your submission worldwide in electronic or any medium for non-commercial, academic purposes. You agree that CSUN ScholarWorks may, without changing the content, translate the submission to any medium or format, as well as keep more than one copy, for the purposes of security, backup and preservation. You represent that the submission is your original work, and that you have the right to grant the rights contained in this license. You also represent that your submission does not, to the best of your knowledge, infringe upon anyone's copyright. If the submission contains material for which you do not hold copyright, or for which the intended use is not permitted, or which does not reasonably fall under the guidelines of fair use, you represent that you have obtained the unrestricted permission of the copyright owner to grant CSUN ScholarWorks the rights required by this license, and that such third-party owned material is clearly identified and acknowledged within the text or content of the submission. If the submission is based upon work that has been sponsored or supported by an agency or organization other than the California State University, Northridge, you represent that you have fulfilled any right of review or other obligations required by such contract or agreement. CSUN ScholarWorks will clearly identify your name(s) as the author(s) or owner(s) of the submission, and will not make any alterations, other than those allowed by this license, to your submission. en_US
