International Journal of Innovative Approaches in Science Research

Volume 10 (2026)

Issue Information

pp. i - vi | DOI: 10.29329/ijiasr.2026.1442

Original Articles

Research article

Evaluating Large Language Models for Emotion Classification and Contextual Word Prediction in NLP

Murat Eser

pp. 1 - 16 | DOI: 10.29329/ijiasr.2026.1442.1

Abstract

Large Language Models (LLMs) are deep learning models capable of understanding, interpreting, and generating natural language with high accuracy. They make large-scale, complex data easier to process and improve efficiency across natural language processing (NLP) tasks, including text classification, sentiment analysis, contextual understanding, and automatic content generation. This study evaluates the sentiment analysis and contextual understanding capabilities of LLMs in the NLP domain. It contributes to the literature with an integrated evaluation that assesses LLMs not only for their generative and classification capabilities but also for their ability to maintain and predict semantic integrity. A balanced dataset covering five emotion categories was used to test ChatGPT 5.3, Claude Sonnet 4.6, and Gemini 3.1 Pro on classification and fill-in-the-blank tasks in a zero-shot setting. In the classification task, model performance was evaluated using accuracy, precision, recall, and F1-score. Claude Sonnet 4.6 performed best, with an accuracy of 99.52%. In the fill-in-the-blank task, the semantic similarity among the predicted words, the original words, and the completed sentences was measured using SBERT and cosine similarity. Here, Gemini 3.1 Pro achieved the highest similarity, scoring 0.85 for word similarity and 0.94 for sentence similarity. The findings indicate that the examined LLMs generally performed well in emotion classification and contextual word prediction. In particular, Claude Sonnet 4.6 was stronger in classification, while Gemini 3.1 Pro was stronger in the semantic fill-in-the-blank task. These results highlight the potential of LLMs to understand and complete emotion-bearing texts in everyday language, underscoring their importance in NLP research.

Keywords: Large Language Models, Natural Language Processing, Classification, Semantic Similarity Analysis, Sentiment Classification
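
The evaluation measures named in the abstract above (accuracy, precision, recall, F1-score, and cosine similarity) can be illustrated with a minimal Python sketch. It is not the author's implementation. The function names `macro_metrics` and `cosine_similarity`, the macro-averaging choice, the emotion labels, and the toy vectors are all assumptions made for illustration. In the study, the vectors would be SBERT embeddings; here they are plain lists of numbers so the example runs with the standard library only.

```python
import math
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1.

    Macro averaging is an assumption; the abstract does not state
    which averaging scheme was used.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong
            fn[truth] += 1  # true label was missed

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        prec = tp[label] / p_den if p_den else 0.0
        rec = tp[label] / r_den if r_den else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


# Hypothetical labels and predictions, not data from the study.
y_true = ["joy", "joy", "anger", "fear"]
y_pred = ["joy", "anger", "anger", "fear"]
scores = macro_metrics(y_true, y_pred)

# Toy 2-dimensional vectors standing in for SBERT embeddings.
similarity = cosine_similarity([1.0, 1.0], [1.0, 0.0])
```

A cosine similarity near 1 means the predicted word or completed sentence points in nearly the same direction as the original in embedding space, which is how scores such as 0.85 and 0.94 should be read.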