Volume 8, Number 1
Isolating Word Level Rules in Tamil Language for Efficient Development of Language Tools
Authors
Suriyah M, Aarthy Anandan, Anitha Narasimhan and Madhan Karky, Karky Research Foundation, India
Abstract
With the advent of social media, the amount of text available for processing across natural languages has become enormous, and the past few decades have seen a tremendous increase in the number of language processing applications. Tools for natural language computing differ greatly from one language to another because each language has its own set of grammatical rules. This paper focuses on identifying the basic inflectional principles of the Tamil language at the word level. Three levels of word inflection concepts are considered: Patterns, Rules and Exceptions. The paper shows how the grammatical principles for word inflection in Tamil can be grouped into these three levels and applied to obtain different word forms. These can be used in a wide variety of natural language applications such as morphological analysis, morphological generation, word-level translation, spelling and grammar checking, and information extraction. Tools built on these rules should offer faster operation and a better implementation of the Tamil grammatical rules drawn from [தொல்காப்பியம் | tholgaappiyam] and [நன்னூல் | nannool] in NLP applications.
Keywords
Natural language processing, rule-based approach, word-level rules, Tamil tool, language tools