IJSMT Journal

International Journal of Science, Strategic Management and Technology

An international, peer-reviewed, open access scholarly journal indexed in recognized academic databases. DOI via Crossref. The journal adheres to established scholarly publishing, peer-review, and research ethics guidelines set by the UGC.

ISSN: 3108-1762 (Online)

ENHANCING IMAGE CAPTIONING THROUGH AUGMENTED VISUAL COMPREHENSION WITH CNN

AUTHORS:
R. L. Pavan Kumar, P. Nithin Sai, G. Surendra
Mentor
T. V. D. S. Sreyanth
Affiliation
Department of AI & DS, Koneru Lakshmaiah Education Foundation
CC BY 4.0 License:
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Deep learning and computer vision are advancing quickly, and the challenge of automatically generating informative captions for photographs has received considerable attention. As these advances reshape the artificial intelligence landscape, demand is growing for intelligent systems that can describe visual content in context. Image captioning is a research area at the intersection of computer vision and deep learning. This paper explores the application of deep learning to generating descriptive captions for images. The architecture combines a Convolutional Neural Network (CNN), which extracts meaningful visual features from images, with an LSTM-based recurrent neural network (RNN) for language modeling. The proposed model also incorporates YOLO-based object detection into the feature extraction process, which makes the image representation more robust. Attention mechanisms address the problem of aligning linguistic and visual information, enabling the model to concentrate on distinct regions of the image while generating captions.
Keywords
CNN, LSTM, YOLO, BLEU
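The attention step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which scoring function the model uses, so dot-product scoring is assumed here, and the names `attend`, `region_features`, and `hidden_state` as well as the toy numbers are hypothetical. Each region vector stands in for a CNN grid cell or a YOLO detection, and the hidden state stands in for the LSTM decoder state at one time step.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(region_features, hidden_state):
    """Return (weights, context) for one decoding step.

    region_features: list of equal-length feature vectors, one per image region
    hidden_state: decoder state vector with the same length as each feature
    """
    # Score each region by its dot product with the decoder state.
    # Assumption: the paper does not specify the scoring function.
    scores = [sum(f * h for f, h in zip(feat, hidden_state))
              for feat in region_features]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the region features.
    dim = len(hidden_state)
    context = [sum(w * feat[i] for w, feat in zip(weights, region_features))
               for i in range(dim)]
    return weights, context

# Toy inputs: three regions with 2-d features (made-up numbers).
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
state = [2.0, 0.0]
weights, context = attend(regions, state)
```

In this toy case the first and third regions align with the decoder state and receive equal, larger weights than the second. In a full captioning model the context vector would be fed to the decoder alongside the previous word at every step.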
HOW TO CITE
APA
R. L. Pavan Kumar, P. Nithin Sai, & G. Surendra. (2026). Enhancing Image Captioning Through Augmented Visual Comprehension with CNN. International Journal of Science, Strategic Management and Technology, 10(01). https://doi.org/10.55041/ijsmt.v2i2.044

MLA
R. L. Pavan Kumar, P. Nithin Sai, and G. Surendra. "Enhancing Image Captioning Through Augmented Visual Comprehension with CNN." International Journal of Science, Strategic Management and Technology, vol. 10, no. 01, 2026. https://doi.org/10.55041/ijsmt.v2i2.044.

Chicago
R. L. Pavan Kumar, P. Nithin Sai, and G. Surendra. "Enhancing Image Captioning Through Augmented Visual Comprehension with CNN." International Journal of Science, Strategic Management and Technology 10, no. 01 (2026). https://doi.org/10.55041/ijsmt.v2i2.044.


Ethics and Compliance
This article has undergone plagiarism screening and double-blind peer review. Editorial policies have been followed. Authors retain copyright under CC BY-NC 4.0 license. The research complies with ethical standards and institutional guidelines.