IJSMT Journal

International Journal of Science, Strategic Management and Technology

An international, peer-reviewed, open access scholarly journal indexed in recognized academic databases · DOI via Crossref. The journal adheres to the scholarly publishing, peer-review, and research ethics guidelines set by the UGC.

ISSN: 3108-1762 (Online)

Plagiarism Passed
Peer reviewed
Open Access

QUANTITATIVE ANALYSIS OF AUDIO DESCRIPTORS FOR EMOTION-BASED MUSIC CLASSIFICATION

AUTHORS: Sarvagya Dubey
Mentor: Dr. Prabha Nair
Affiliation: Department of Information Technology, Noida Institute of Engineering & Technology, Greater Noida
CC BY 4.0 License:
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract

Modern music streaming services have made music easy to access, but their recommendation systems still have a significant limitation. They rely mostly on methods based on listening history and on the content of the music, and they do not account for how the listener is feeling in the moment. This research addresses that problem by building a system that classifies music by emotion from how it sounds. The system uses audio signal processing to convert each recording into an image-like representation of its sound, which is then fed to a model that recognizes patterns in the music. The model is trained on a dataset of music drawn from several sources. The audio is sampled at 22,050 Hz and converted into Log-Mel Spectrograms measuring 128 pixels wide by 376 pixels tall. The model is a Convolutional Neural Network (CNN), a type of deep learning model, trained with Batch Normalization and the Adam optimizer. The model reaches a classification accuracy of 88.4%. It is most reliable on energetic and happy music, with precisions of 0.94 and 0.91, respectively. It has a harder time with sad and calm music, which can sound similar, and it sometimes labels sad music as calm or vice versa. To address this, the researchers tried different settings and found that a relatively high dropout rate, between 0.3 and 0.5, helped the model learn better. The model also runs in real time, taking less than 150 ms to classify a song, so it can be used alongside music streaming services such as Spotify to make recommendations. Overall, this research shows that deep learning can support a more empathetic and responsive music recommendation system.
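The spectrogram dimensions quoted in the abstract can be related to the sampling rate with a little arithmetic. The sketch below is illustrative only and is not the authors' code: it assumes the 128 dimension is the number of mel bands, a hop length of 512 samples, centered framing, and the HTK-style mel formula, none of which the abstract states. All function names are hypothetical.

```python
import math

SAMPLE_RATE = 22_050  # Hz, as stated in the abstract
N_MELS = 128          # assumption: the 128 dimension is the mel-band axis
HOP_LENGTH = 512      # assumption: hop length is not given in the abstract


def hz_to_mel(freq_hz: float) -> float:
    """Convert Hz to mel using the HTK-style formula (one common definition)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_mels: int = N_MELS, sample_rate: int = SAMPLE_RATE) -> list:
    """Center frequencies (Hz) of n_mels filters spaced evenly on the mel scale.

    n_mels triangular filters need n_mels + 2 edge points between 0 Hz and
    the Nyquist frequency; the centers are the interior points.
    """
    max_mel = hz_to_mel(sample_rate / 2)
    step = max_mel / (n_mels + 1)
    return [mel_to_hz(step * i) for i in range(1, n_mels + 1)]


def num_frames(n_samples: int, hop_length: int = HOP_LENGTH) -> int:
    """Frame count for a centered short-time transform (signal padded at both ends)."""
    return 1 + n_samples // hop_length


def power_to_db(power: float, ref: float = 1.0, floor: float = 1e-10) -> float:
    """Log-compress a power value to decibels, clamping to avoid log(0)."""
    return 10.0 * math.log10(max(power, floor) / ref)


if __name__ == "__main__":
    centers = mel_band_centers()
    print(len(centers), "mel bands")
    print(num_frames(192_000), "frames for a 192,000-sample clip")
```

Under these assumptions, a clip of 192,000 samples (roughly 8.7 seconds at 22,050 Hz) yields 376 frames, which would match the second spectrogram dimension. A different hop length or framing convention would imply a different clip duration.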

Keywords
HOW TO CITE
APA
Dubey, S. (2026). Quantitative Analysis of Audio Descriptors for Emotion-Based Music Classification. International Journal of Science, Strategic Management and Technology, 02(05). https://doi.org/10.55041/ijsmt.v2i5.184

MLA
Dubey, Sarvagya. "Quantitative Analysis of Audio Descriptors for Emotion-Based Music Classification." International Journal of Science, Strategic Management and Technology, vol. 02, no. 05, 2026. https://doi.org/10.55041/ijsmt.v2i5.184.

Chicago
Dubey, Sarvagya. "Quantitative Analysis of Audio Descriptors for Emotion-Based Music Classification." International Journal of Science, Strategic Management and Technology 02, no. 05 (2026). https://doi.org/10.55041/ijsmt.v2i5.184.

Ethics and Compliance
✓ All ethical standards met
This article has undergone plagiarism screening and double-blind peer review. Editorial policies have been followed. Authors retain copyright under CC BY-NC 4.0 license. The research complies with ethical standards and institutional guidelines.