Volume 9, Number 12, September 2019
Including Natural Language Processing and Machine Learning into Information Retrieval
Authors
Piotr Malak and Artur Ogurek, University of Wrocław, Poland
Abstract
In this paper we discuss the results of preliminary but promising research on incorporating
selected Natural Language Processing (NLP) and Machine Learning (ML) approaches into
Information Retrieval (IR). Classical IR uses indexing and term weighting to increase the
pertinence of answers given to user queries. Such an approach allows for matching by meaning,
i.e. matching all keywords with the same or very similar meaning as those expressed in the user
query. In most cases this approach is sufficient to fulfil users' information needs.
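
The indexing and term weighting step mentioned above can be illustrated with a minimal sketch. It assumes a plain TF-IDF weighting scheme and whitespace tokenisation; the function names (tf_idf_index, rank) and the sample documents are hypothetical and are not taken from the system described in this paper.

```python
import math
from collections import Counter


def tf_idf_index(docs):
    """Build one TF-IDF weight map per tokenised document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    index = []
    for doc in docs:
        counts = Counter(doc)
        # Weight = relative term frequency * log inverse document frequency.
        index.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return index


def rank(query_terms, index):
    """Order document ids by the summed weights of the query terms."""
    scores = [sum(weights.get(t, 0.0) for t in query_terms) for weights in index]
    return sorted(range(len(index)), key=lambda i: scores[i], reverse=True)


# Hypothetical sample documents, tokenised on whitespace.
docs = [
    "patient shows no sign of infection".split(),
    "infection confirmed in patient sample".split(),
    "court ruling on patent dispute".split(),
]
index = tf_idf_index(docs)
ranking = rank(["infection", "sample"], index)
```

In this sketch the first document still receives a positive score for "infection" although it negates the term: weighting based on term occurrence alone does not take into account the context in which the term appears.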
However, indexing and retrieving information from professional language texts brings new
challenges as well as new possibilities. One of the challenges is a different grammar, which
creates the need to adjust NLP tools to a given professiolect. One of the possibilities is
detecting the context in which an indexed term occurs in the text.
In our research we attempted to answer the question of whether a Natural Language
Processing approach combined with supervised Machine Learning is capable of detecting
contextual features of professional language texts.
Keywords
Enhanced Information Retrieval, Contextual IR, NLP, Machine Learning