A REVIEW ON HUMAN STRESS AND ANXIETY DETECTION USING SPEECH SIGNALS AND DEEP LEARNING TECHNIQUES
Mental health disorders such as stress and anxiety have become major healthcare concerns worldwide. Early identification of stress-related conditions is essential for preventing severe psychological and physiological complications. In recent years, speech-based emotion recognition systems have gained significant attention due to their non-invasive and real-time monitoring capability. This review paper presents a comprehensive survey of machine learning and deep learning techniques used for stress and anxiety detection from speech signals. Various acoustic feature extraction methods including Mel Frequency Cepstral Coefficients (MFCC), spectral contrast, chroma features, and zero-crossing rate are discussed. In addition, different classification models such as Random Forest, XGBoost, Convolutional Neural Networks (CNN), Long Short-Term Memory (LSTM), and Residual Networks (ResNet) are reviewed and compared. Publicly available speech emotion datasets and recent advancements in deep learning-based emotional speech analysis are also summarized. The study highlights current challenges, research gaps, and future directions for intelligent speech-based mental health monitoring systems.
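To make one of the acoustic features named in the abstract concrete, the following is a minimal sketch of a frame-wise zero-crossing rate computation. The function names, the frame length and hop size, and the synthetic sine tones standing in for speech are all assumptions made for illustration and are not taken from the reviewed works. MFCC, chroma, and spectral contrast features are usually computed with a dedicated audio library and are not shown here.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames (hypothetical helper)."""
    last_start = len(signal) - frame_len
    return [signal[i:i + frame_len] for i in range(0, last_start + 1, hop)]


def sine(freq_hz, sample_rate, n_samples):
    """Synthetic test tone used here in place of real speech."""
    return [
        math.sin(2 * math.pi * freq_hz * t / sample_rate)
        for t in range(n_samples)
    ]


# Assumed illustration values: 8 kHz sampling, 200-sample frames, 100-sample hop.
sample_rate = 8000
low_tone = sine(100, sample_rate, 800)
high_tone = sine(1000, sample_rate, 800)

zcr_low = [zero_crossing_rate(f) for f in frame_signal(low_tone, 200, 100)]
zcr_high = [zero_crossing_rate(f) for f in frame_signal(high_tone, 200, 100)]

mean_low = sum(zcr_low) / len(zcr_low)
mean_high = sum(zcr_high) / len(zcr_high)
print(mean_low < mean_high)
```

The higher-frequency tone crosses zero more often per frame, which is why the zero-crossing rate is commonly used as a cheap indicator of how much high-frequency or noise-like content a speech frame contains.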
Kumari, S. (2026). A Review on Human Stress and Anxiety Detection using Speech Signals and Deep Learning Techniques. International Journal of Science, Strategic Management and Technology, 02(6). https://doi.org/10.55041/ijsmt.v2i6.097