Examining Accuracy Heterogeneities in Classification of Multilingual

doi:10.5121/csit.2023.131221

Authors

Raghav Subramaniam, Researcher, USA

Abstract

Tools for detecting AI-generated text are used globally, but the nature of the apparent accuracy disparities between languages needs further study. This paper examines these differences by testing OpenAI’s “AI Text Classifier” on a set of AI-generated and human-written texts in English, Swahili, German, Arabic, Chinese, and Hindi. Current tools for detecting AI-generated text are already fairly easy to discredit, as misclassifications have been shown to be fairly common, but such vulnerabilities often persist in slightly different ways in non-English languages: classification of human-written text as AI-generated, and vice versa, could occur more frequently in some language environments than in others. Our findings indicate that false positives (human-written text labeled as AI-generated) are more likely to occur in Hindi and Arabic, whereas false negatives (AI-generated text labeled as human-written) are more likely to occur in English. The other languages tested tended not to be confidently labeled at all.
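The per-language comparison described above can be illustrated with a minimal sketch, assuming each test text is recorded as a (language, true label, classifier label) triple. The function name, the label strings "ai", "human", and "unclear", and the sample records are hypothetical placeholders chosen for illustration; they are not the paper's code or data.

```python
from collections import defaultdict


def error_rates_by_language(records):
    """Compute false positive, false negative, and unclear rates per language.

    records: iterable of (language, true_label, predicted_label), where
    true_label is "ai" or "human" and predicted_label is "ai", "human",
    or "unclear" (assumed label names, not the classifier's own wording).
    """
    counts = defaultdict(lambda: {
        "total": 0, "unclear": 0,
        "human_total": 0, "false_pos": 0,
        "ai_total": 0, "false_neg": 0,
    })
    for language, truth, predicted in records:
        c = counts[language]
        c["total"] += 1
        if predicted == "unclear":
            c["unclear"] += 1
        if truth == "human":
            c["human_total"] += 1
            if predicted == "ai":      # human-written text flagged as AI
                c["false_pos"] += 1
        elif truth == "ai":
            c["ai_total"] += 1
            if predicted == "human":   # AI-generated text passed as human
                c["false_neg"] += 1

    rates = {}
    for language, c in counts.items():
        rates[language] = {
            # None marks a rate that is undefined for lack of samples.
            "false_positive_rate": (
                c["false_pos"] / c["human_total"] if c["human_total"] else None
            ),
            "false_negative_rate": (
                c["false_neg"] / c["ai_total"] if c["ai_total"] else None
            ),
            "unclear_rate": c["unclear"] / c["total"],
        }
    return rates


# Made-up placeholder records, not results from the paper.
sample_records = [
    ("lang_a", "human", "ai"),
    ("lang_a", "human", "human"),
    ("lang_a", "ai", "ai"),
    ("lang_b", "ai", "human"),
    ("lang_b", "ai", "unclear"),
    ("lang_b", "human", "human"),
]
rates = error_rates_by_language(sample_records)
```

Tracking the "unclear" outcome separately matters here because a language whose texts are rarely labeled with confidence can show low false positive and false negative rates without the classifier being accurate on it.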

Keywords

Artificial Intelligence, Generative AI, AI Detection, Natural Language Processing, GPT

AIRCC
