Academy & Industry Research Collaboration Center (AIRCC)

Volume 11, Number 07, May 2021

Fenix: A Semantic Search Engine Based on an Ontology and a Model Trained
with Machine Learning to Support Research

  Authors

Felipe Cujar-Rosero, David Santiago Pinchao Ortiz, Silvio Ricardo Timaran Pereira and Jimmy Mateo Guerrero Restrepo, University of Nariño, Colombia

  Abstract

This paper presents the final results of a research project that aimed to build a Semantic Search Engine that uses an Ontology and a model trained with Machine Learning to support the semantic search of research projects in the Research System of the University of Nariño. FENIX, as this Engine is called, was built following a methodology with these stages: appropriation of knowledge; installation and configuration of tools, libraries and technologies; collection, extraction and preparation of research projects; and design and development of the Semantic Search Engine. The work produced three main results: a) the complete construction of the Ontology, with classes, object properties (predicates), data properties (attributes) and individuals (instances), in Protégé, together with SPARQL queries run with Apache Jena Fuseki and the corresponding code written with Owlready2 in a Jupyter Notebook with Python inside an Anaconda virtual environment; b) the successful training of the model, using Machine Learning and, specifically, Natural Language Processing libraries and algorithms such as spaCy, NLTK, Word2vec and Doc2vec, also in a Jupyter Notebook with Python inside the Anaconda virtual environment and with Elasticsearch; and c) the creation of FENIX itself, which manages and unifies the queries to the Ontology and to the Machine Learning model. In the tests, FENIX returned satisfactory results for all the searches that were carried out.
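As an illustration of result c), the idea of unifying a structured Ontology query with a similarity query against a trained model can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the FENIX implementation: the project records, the function names (`ontology_lookup`, `similarity_search`, `unified_search`) and the merging rule are all hypothetical, an exact-match dictionary lookup stands in for a SPARQL query, and a bag-of-words cosine similarity stands in for a Doc2vec model.

```python
import math
from collections import Counter

# Hypothetical toy records; they are not taken from the paper or its dataset.
PROJECTS = {
    "p1": {"title": "Ontology for crop disease diagnosis", "author": "Ana Ruiz"},
    "p2": {"title": "Machine learning for coffee crop yield", "author": "Luis Mora"},
    "p3": {"title": "History of regional music", "author": "Ana Ruiz"},
}


def ontology_lookup(author):
    """Stand-in for a structured (SPARQL-style) query: exact match on one data property."""
    wanted = author.lower()
    return [pid for pid, p in PROJECTS.items() if p["author"].lower() == wanted]


def vectorize(text):
    """Bag-of-words term counts; a stand-in for a learned document embedding."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def similarity_search(query, top_n=2):
    """Rank projects by similarity of their title to the free-text query."""
    q = vectorize(query)
    scored = [(cosine(q, vectorize(p["title"])), pid) for pid, p in PROJECTS.items()]
    scored = [(s, pid) for s, pid in scored if s > 0]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [pid for _, pid in scored[:top_n]]


def unified_search(query, author=None):
    """Assumed merging rule: structured matches first, then similarity matches, no duplicates."""
    results = []
    if author:
        results.extend(ontology_lookup(author))
    for pid in similarity_search(query):
        if pid not in results:
            results.append(pid)
    return results


if __name__ == "__main__":
    print(unified_search("crop disease", author="Ana Ruiz"))
```

Placing the structured matches first is just one possible design choice; a real engine could equally interleave or re-rank the two result sets.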

  Keywords

Search Engine, Semantic Web, Ontology, Machine Learning, Natural Language Processing.