Academy & Industry Research Collaboration Center (AIRCC)

Volume 9, Number 12, September 2019

Including Natural Language Processing and Machine Learning into Information Retrieval

  Authors

Piotr Malak and Artur Ogurek, University of Wrocław, Poland

  Abstract

In this paper we discuss the preliminary but promising results of research on incorporating selected Natural Language Processing (NLP) and Machine Learning (ML) approaches into Information Retrieval (IR). Classical IR uses indexing and term weighting to increase the pertinence of the answers given to user queries. This approach allows for matching meaning, i.e. matching all keywords whose meaning is the same as, or very similar to, that expressed in the user query. In most cases this approach is sufficient to fulfil the user's information needs.
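
As an illustration of the classical indexing and term weighting described above, a minimal sketch of TF-IDF scoring follows. TF-IDF is one common weighting scheme and is used here as an assumption; the abstract does not state which scheme the authors have in mind. The sample documents, the whitespace tokenizer, and all function names are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive tokenizer (assumption): lowercase and split on whitespace.
    return text.lower().split()


def build_index(docs):
    """Return per-document term frequencies and corpus document frequencies."""
    term_freqs = [Counter(tokenize(d)) for d in docs]
    doc_freq = Counter()
    for tf in term_freqs:
        doc_freq.update(tf.keys())  # count each term once per document
    return term_freqs, doc_freq


def score(query, term_freqs, doc_freq):
    """Sum tf * idf over the query terms for every document."""
    n_docs = len(term_freqs)
    scores = []
    for tf in term_freqs:
        total = 0.0
        for term in tokenize(query):
            if term in tf:
                idf = math.log(n_docs / doc_freq[term])
                total += tf[term] * idf
        scores.append(total)
    return scores


# Hypothetical toy corpus, for illustration only.
docs = [
    "the patient reported chest pain",
    "the court dismissed the claim",
    "chest x-ray showed no abnormality",
]
term_freqs, doc_freq = build_index(docs)
scores = score("chest pain", term_freqs, doc_freq)

# Document indices ordered from best to worst match.
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

In this sketch a document is ranked purely by which query terms it contains and how rare they are in the corpus; the surroundings in which a term occurs play no role in the score.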

However, indexing and retrieving information from professional-language texts brings new challenges as well as new possibilities. One of the challenges is a different grammar, which makes it necessary to adjust NLP tools to a given professiolect. One of the possibilities is detecting the context in which an indexed term occurs in the text.

In our research we attempted to answer the question of whether a Natural Language Processing approach combined with supervised Machine Learning is capable of detecting contextual features of professional-language texts.

  Keywords

Enhanced Information Retrieval, Contextual IR, NLP, Machine Learning