Vision-Based Context Understanding using Multimodal AI

Gupta, Rudra; Khalid, Abdul

doi:https://doi.org/10.55041/ijsmt.v2i5.160

AUTHORS:

Rudra Gupta

Abdul Khalid

Mentor

Affiliation

B.Tech (Information Technology) NIET, Greater Noida

CC BY 4.0 License:

This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Understanding an image in a meaningful way requires more than identifying isolated objects. A useful description of a scene must also capture actions, relations, setting, intent, and often a coarse sense of social or emotional context. This ability, which humans perform naturally, remains a difficult

Ethics and Compliance

✓ All ethical standards met

This article has undergone plagiarism screening and double-blind peer review. Editorial policies have been followed. Authors retain copyright under CC BY-NC 4.0 license. The research complies with ethical standards and institutional guidelines.
