Volume 8, Number 5

Auto Correction of Setswana Real-word Errors

  Authors

Gabofetswe Malema, Boago Okgetheng, Moffat Motlhanka and Goaletsa Rammidi, University of Botswana, Botswana

  Abstract

Spell checkers are used to detect and, where possible, correct spelling errors. Errors are classified as non-word errors and real-word errors. Detecting and correcting real-word errors requires considering the context of the sentence. The Setswana language has several commonly used words that are often misspelled by either separating or merging them, and these misspellings result in real-word errors. In this paper we propose contextual rules that look at neighboring words to determine whether the correct form is written as two separate words or merged as one word. For some words the rules require that the part-of-speech category of neighboring words be determined, whereas others depend on specific neighboring words or on position in the sentence. The implemented rules are very consistent, achieving an 88% success rate. Our tool looks only at neighboring words and therefore does not consider the context of the whole sentence. Hence, our rules fail for words that require the context of the whole sentence to disambiguate correctly. This module can be incorporated into a spell checker to detect and correct real-word errors for some words, that is, to help users determine the correct orthography of certain words.
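
As an illustration only, the idea of a neighbor-word rule can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the tokens `xa`, `yo`, `xayo`, `tema` and `podi`, the `POS` lookup table, the `Rule` class and the single rule shown are all hypothetical placeholders and do not represent real Setswana words or the rules proposed in the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical part-of-speech lookup; placeholder tokens, not real Setswana.
POS = {"tema": "VERB", "podi": "NOUN"}


@dataclass
class Rule:
    first: str    # first part of the word pair
    second: str   # second part of the word pair
    # Decides, from the previous and next neighbor, whether to write one word.
    merge_if: Callable[[Optional[str], Optional[str]], bool]


# Hypothetical rule: write "xayo" as one word only when a verb follows.
RULES = [Rule("xa", "yo", lambda prev, nxt: POS.get(nxt) == "VERB")]


def correct(tokens: List[str]) -> List[str]:
    """Rewrite each known pair as merged or split, based on its neighbors."""
    out: List[str] = []
    i = 0
    while i < len(tokens):
        for rule in RULES:
            merged = rule.first + rule.second
            if tokens[i] == merged:
                width = 1  # the writer merged the pair
            elif tokens[i:i + 2] == [rule.first, rule.second]:
                width = 2  # the writer separated the pair
            else:
                continue
            prev = out[-1] if out else None
            nxt = tokens[i + width] if i + width < len(tokens) else None
            if rule.merge_if(prev, nxt):
                out.append(merged)
            else:
                out.extend([rule.first, rule.second])
            i += width
            break
        else:
            # No rule applies to this token; copy it unchanged.
            out.append(tokens[i])
            i += 1
    return out
```

Because such a rule inspects only the immediate neighbors, it cannot resolve cases where the correct form depends on the meaning of the whole sentence, which matches the limitation noted above.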

  Keywords

Spell checker, real-word errors, dictionary