Evaluating artificial intelligence large language models’ performances in a South African high school chemistry exam
Samuel Jere 1 *
1 University of Venda, Thohoyandou, SOUTH AFRICA
* Corresponding Author

Abstract

Gemini, ChatGPT Plus, and Claude 3.5 Sonnet are artificial intelligence (AI) chatbots with potential in education. Their capabilities, such as acting as virtual teaching assistants, offering personalized responses to learners’ queries, and summarizing content, make them versatile tools for supporting learners. The chemistry section of physical sciences in South Africa is often considered challenging, and learners could benefit from virtual teaching assistants that supplement traditional instruction. However, little is known about AI chatbots’ ability to solve high school chemistry problems. This descriptive case study examined how accurately Gemini, Claude 3.5 Sonnet, and ChatGPT Plus answered questions from the final grade 12 physical sciences chemistry exam in South Africa. Bloom’s taxonomy of educational objectives served as the conceptual framework. The responses were evaluated using the same criteria and rubrics applied to the candidates that year, so that chatbot and candidate scores could be compared on the same basis. ChatGPT Plus scored 47%, Gemini 51%, and Claude 3.5 Sonnet 65%. All three chatbots scored above the 46% average of the candidates who sat the paper that year. These findings have implications for policymakers, teachers, and learners regarding the integration of large language models into physical sciences teaching and exam preparation.

License

This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Article Type: Research Article

EURASIA J Math Sci Tech Ed, Volume 21, Issue 2, February 2025, Article No: em2582

https://doi.org/10.29333/ejmste/15932

Publication date: 07 Feb 2025
