Quantitative Analysis of Audio Descriptors for Emotion-Based Music Classification

Dubey, Sarvagya

doi:https://doi.org/10.55041/ijsmt.v2i5.184

AUTHORS:

Sarvagya Dubey

Mentor

Dr. Prabha Nair

Affiliation

Department of Information Technology, Noida Institute of Engineering & Technology, Greater Noida

CC BY 4.0 License:

This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Modern music streaming services have made music easy to access, but their recommendation systems still have a significant limitation. They rely mostly on what people have listened to before and on descriptions of the music's content, and they do not account for how the listener is feeling at the moment. This research addresses that gap with a system that classifies music by emotion based on how it sounds. Audio signals are converted into images that represent the sound, and these images are fed to a model that recognizes patterns in them. The model is trained on a dataset of music drawn from several sources. The audio is sampled at 22,050 Hz and converted into Log-Mel Spectrograms that are 128 pixels wide and 376 pixels tall. The model is a Convolutional Neural Network (CNN), a type of deep learning model, trained with Batch Normalization and the Adam optimizer. The model reaches an accuracy of 88.4%. It is strongest on energetic and happy music, with precisions of 0.94 and 0.91 respectively. It has a harder time with sad and calm music, which can sound similar, and it sometimes labels sad music as calm or vice versa. To reduce this confusion, the researchers tried different settings and found that a high dropout rate, between 0.3 and 0.5, helped the model learn better. The model also runs in real time, taking less than 150 ms to classify a song, so it could be used with music streaming services such as Spotify to give people music recommendations. Overall, this research shows that deep learning can be used to build an empathetic and responsive music recommendation system.
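The relationship between the sampling rate and the spectrogram dimensions quoted in the abstract can be illustrated with a little arithmetic. The sketch below is not the authors' pipeline. It assumes a hop length of 512 samples, centred framing, and the HTK-style mel formula, none of which are stated in the abstract, and all function names are hypothetical.

```python
import math

SAMPLE_RATE = 22_050   # Hz, from the abstract
N_MELS = 128           # one spectrogram dimension, from the abstract
N_FRAMES = 376         # the other spectrogram dimension, from the abstract
HOP_LENGTH = 512       # ASSUMPTION: hop length is not given in the abstract


def hz_to_mel(freq_hz: float) -> float:
    """HTK-style mel scale (an assumed variant): 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centres(n_mels: int, sample_rate: int) -> list:
    """Centre frequencies (Hz) of n_mels bands spaced evenly on the mel
    scale between 0 Hz and the Nyquist frequency."""
    top = hz_to_mel(sample_rate / 2)
    step = top / (n_mels + 1)
    return [mel_to_hz(step * (i + 1)) for i in range(n_mels)]


def frame_count(n_samples: int, hop_length: int) -> int:
    """Number of frames under centred framing (signal padded at both ends)."""
    return 1 + n_samples // hop_length


def clip_seconds(n_frames: int, hop_length: int, sample_rate: int) -> float:
    """Shortest clip duration that yields n_frames under centred framing."""
    return (n_frames - 1) * hop_length / sample_rate


def power_to_db(power: float, floor: float = 1e-10) -> float:
    """Log compression applied to each mel-band energy (the 'Log' in Log-Mel)."""
    return 10.0 * math.log10(max(power, floor))


if __name__ == "__main__":
    seconds = clip_seconds(N_FRAMES, HOP_LENGTH, SAMPLE_RATE)
    print(f"{N_FRAMES} frames correspond to roughly {seconds:.1f} s of audio")
    centres = mel_band_centres(N_MELS, SAMPLE_RATE)
    print(f"lowest band centre: {centres[0]:.1f} Hz, highest: {centres[-1]:.1f} Hz")
```

Under these assumptions, 376 frames correspond to a clip of about 8.7 seconds. A different hop length would change that figure, so it should be read as an illustration of the arithmetic and not as a property of the dataset.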

Ethics and Compliance

✓ All ethical standards met

This article has undergone plagiarism screening and double-blind peer review. Editorial policies have been followed. Authors retain copyright under CC BY-NC 4.0 license. The research complies with ethical standards and institutional guidelines.

International Journal of Science, Strategic Management and Technology

ISSN: 3108-1762 (Online)
