- Annepaka, Y., & Pakray, P. (2025). Large language models: a survey of their development, capabilities, and applications. Knowledge and Information Systems, 67(3), 2967–3022.
- Chang, Y., Wang, X., Wang, J., Wu, Y., Yang, L., Zhu, K., Chen, H., Yi, X., Wang, C., & Wang, Y. (2024). A survey on evaluation of large language models. ACM Transactions on Intelligent Systems and Technology, 15(3), 1–45.
- Feng, H., Ronzano, F., LaFleur, J., Garber, M., De Oliveira, R., Rough, K., Roth, K., Nanavati, J., El Abidine, K. Z., & Mack, C. (2024). Evaluation of large language model performance on the biomedical language understanding and reasoning benchmark: Comparative study. MedRxiv, 2024–2025.
- Jiang, P., Xiao, C., Wang, Z., Bhatia, P., Sun, J., & Han, J. (2024). Trisum: Learning summarization ability from large language models with structured rationale. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2805–2819.
- Lahitani, A. R., Permanasari, A. E., & Setiawan, N. A. (2016). Cosine similarity to determine similarity measure: Study case in online essay assessment. 2016 4th International Conference on Cyber and IT Service Management, 1–6.
- Leon, M. (2024). Benchmarking large language models with a unified performance ranking metric. International Journal in Foundations of Computer Science & Technology, 4.
- Li, J., Bian, Y., Wang, G., Lei, Y., Cheng, D., Ding, Z., & Jiang, C. (2023). Cfgpt: Chinese financial assistant with large language model. ArXiv Preprint ArXiv:2309.10654.
- Ma, C., Wu, Z., Wang, J., Xu, S., Wei, Y., Liu, Z., Zeng, F., Jiang, X., Guo, L., & Cai, X. (2024). An iterative optimizing framework for radiology report summarization with ChatGPT. IEEE Transactions on Artificial Intelligence, 5(8), 4163–4175.
- Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. ArXiv Preprint ArXiv:1301.3781.
- Min, B., Ross, H., Sulem, E., Veyseh, A. P. Ben, Nguyen, T. H., Sainz, O., Agirre, E., Heintz, I., & Roth, D. (2023). Recent advances in natural language processing via large pre-trained language models: A survey. ACM Computing Surveys, 56(2), 1–40.
- Naveed, H., Khan, A. U., Qiu, S., Saqib, M., Anwar, S., Usman, M., Akhtar, N., Barnes, N., & Mian, A. (2025). A comprehensive overview of large language models. ACM Transactions on Intelligent Systems and Technology, 16(5), 1–72.
- Pennington, J., Socher, R., & Manning, C. D. (2014). Glove: Global vectors for word representation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1532–1543.
- Raiaan, M. A. K., Mukta, M. S. H., Fatema, K., Fahad, N. M., Sakib, S., Mim, M. M. J., Ahmad, J., Ali, M. E., & Azam, S. (2024). A review on large language models: Architectures, applications, taxonomies, open issues and challenges. IEEE Access, 12, 26839–26874.
- Shao, M., Basit, A., Karri, R., & Shafique, M. (2024). Survey of different large language model architectures: Trends, benchmarks, and challenges. IEEE Access, 12, 188664–188706.
- Sindhu, B., Prathamesh, R. P., Sameera, M. B., & KumaraSwamy, S. (2024). The evolution of large language model: Models, applications and challenges. 2024 International Conference on Current Trends in Advanced Computing (ICCTAC), 1–8.
- Song, X., Xie, K., Lee, L., Chen, R., Clark, J. M., He, H., He, H., Min, J., Zhang, X., & Zheng, S. (2025). Performance evaluation of large language models in statistical programming. ArXiv Preprint ArXiv:2502.13117.
- Su, C.-Y., & McMillan, C. (2024). Distilled GPT for source code summarization. Automated Software Engineering, 31(1), 22.
- Tan, E., & Liu, H. (2022). Performance Comparison of Seven Pretrained Models on a text classification task. Proceedings of the 2022 5th International Conference on Signal Processing and Machine Learning, 8–12.
- Tintin, R., & Yücebaş, S. C. (2026). Duygu-Turk: A Context-Aware Sentiment Analysis Framework for Turkish, Based on Plutchik’s Emotion Model. Journal of Universal Computer Science, 32(4).
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.
- Venkatesh Sharma, K., Ayiluri, P. R., Betala, R., Jagdish Kumar, P., & Shirisha Reddy, K. (2024). Enhancing query relevance: leveraging SBERT and cosine similarity for optimal information retrieval. International Journal of Speech Technology, 27(3), 753–763.
- Veziroğlu, M., & Bucak, İ. (2025). Haber Sınıflandırma Sistemlerinde Naive Bayes ve Makine Öğrenmesi Algoritmaları Arasında Performans Karşılaştırması. Journal of the Institute of Science and Technology, 15(1), 57–70.
- Xu, H., & Ashley, K. (2023). Argumentative segmentation enhancement for legal summarization. ArXiv Preprint ArXiv:2307.05081.
- Zhu, J., Li, J., Wen, Y., & Guo, L. (2024). Benchmarking large language models on CFLUE-a Chinese financial language understanding evaluation dataset. Findings of the Association for Computational Linguistics: ACL 2024, 5673–5693.