IJSMT Journal

International Journal of Science, Strategic Management and Technology

An International, Peer-Reviewed, Open Access Scholarly Journal. Indexed in recognized academic databases · DOI via Crossref. The journal adheres to established scholarly publishing, peer-review, and research ethics guidelines set by the UGC.

ISSN: 3108-1762 (Online)


SMART TAMIL: A DIALECT-AWARE SMALL LANGUAGE MODEL FOR TAMIL NLP

AUTHORS:
Gokul K
Kishore Kumar R
Tholkappiyan R
Mentor
Dr. R PADMA
Affiliation

Department of CS & IT

CC BY 4.0 License:
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract

Smart Tamil is a dialect-sensitive language system designed to capture the diversity and richness of Tamil as spoken across the dialects of Tamil Nadu. Most language systems do not explicitly account for dialect differences in large-scale deployments, so their output is grammatically correct but non-idiomatic. To address this, the Small Language Model (SLM) of Smart Tamil will be trained on small-corpus spoken data, heterogeneous written data, and video data to capture the dialect variations and spoken styles of the five major dialect zones of Tamil Nadu: Kongu Tamil (Coimbatore/Erode), Nellai Tamil (Tirunelveli/Thoothukudi), Kanyakumari Tamil, Central Tamil (Trichy/Thanjavur), and the Urban Tamil of Chennai. The Smart Tamil system has been built as a full-stack React + Flask application with built-in speech synthesis and speech recognition through the Web Speech API.

HOW TO CITE

APA
K, G., R, K. K., & R, T. (2026). Smart Tamil: A Dialect-Aware Small Language Model for Tamil NLP. International Journal of Science, Strategic Management and Technology, 02(05). https://doi.org/10.55041/ijsmt.v2i5.046

MLA
K, Gokul, et al. "Smart Tamil: A Dialect-Aware Small Language Model for Tamil NLP." International Journal of Science, Strategic Management and Technology, vol. 02, no. 05, 2026. https://doi.org/10.55041/ijsmt.v2i5.046.

Chicago
K, Gokul, Kishore Kumar R, and Tholkappiyan R. "Smart Tamil: A Dialect-Aware Small Language Model for Tamil NLP." International Journal of Science, Strategic Management and Technology 02, no. 05 (2026). https://doi.org/10.55041/ijsmt.v2i5.046.

Ethics and Compliance
✓ All ethical standards met
This article has undergone plagiarism screening and double-blind peer review. Editorial policies have been followed. Authors retain copyright under CC BY-NC 4.0 license. The research complies with ethical standards and institutional guidelines.