A MULTILINGUAL DOCUMENT SUMMARIZATION AND DEADLINE DETECTION FOR EDUCATIONAL INSTITUTIONS USING OPENAI
Educational institutes receives many documents from various authorities, most of the times these are either as text-based PDFs or scanned images and are commonly written in English or regional languages. Understanding such documents can be challenging due to their length, unstructured format, and official language complexity. Many a times, users often miss important deadlines embedded within these documents as they are in pdf formats are never revisited again. The proposed system automatically generates concise summaries and extracts critical dates and deadlines using the OpenAI GPT-4o-mini large language model (LLM). The system employs a hybrid processing strategy that dynamically selects between text-based and vision-based inputs to optimize accuracy and token cost, thus increasing the accessibility to key information in the official documents.
Alavani, C. (2026). A Multilingual Document Summarization and Deadline Detection for Educational Institutions using Openai. International Journal of Science, Strategic Management and Technology, 02(04). https://doi.org/10.55041/ijsmt.v2i4.122
Alavani, Chitra. "A Multilingual Document Summarization and Deadline Detection for Educational Institutions using Openai." International Journal of Science, Strategic Management and Technology, vol. 02, no. 04, 2026, pp. . doi:https://doi.org/10.55041/ijsmt.v2i4.122.
Alavani, Chitra. "A Multilingual Document Summarization and Deadline Detection for Educational Institutions using Openai." International Journal of Science, Strategic Management and Technology 02, no. 04 (2026). https://doi.org/https://doi.org/10.55041/ijsmt.v2i4.122.
2.OpenAI. (2023). GPT-4 technical report. OpenAI.
3.OpenAI. (2024). GPT-4o and GPT-4o mini system card. OpenAI.
4.Smith, R. (2007). An overview of the Tesseract OCR engine. Proceedings of the International Conference on Document Analysis and Recognition (ICDAR).
5.Zhou, Y., et al. (2022). Least-to-most prompting enables complex reasoning in large language models. Advances in Neural Information Processing Systems (NeurIPS).