Volume 15, Number 5

Transformer-Based Regression Models for Assessing Reading Passage Complexity: A Deep Learning Approach in Natural Language Processing

  Authors

Harmanpreet Sidhu and Amr Abdel-Dayem, Laurentian University, Canada

  Abstract

Natural Language Processing (NLP) is a vital area of deep learning, widely applied to tasks such as text classification, virtual assistants, speech recognition, and autocorrect features in digital devices. It allows machines to understand and generate human language, enhancing user interactions with software. This paper presents a deep learning model using the Transformer architecture for a regression task: predicting the complexity of reading passages from text excerpts. By leveraging the Transformer’s capability to identify complex patterns in text, the model achieves a relative error rate of about 10%. The paper also examines how design choices influence model performance, focusing on the input representation: one-hot encoding versus learned embeddings. While one-hot encoding provides a simple text representation, embeddings offer a richer, more nuanced view of word relationships. The findings highlight the significance of model design and data representation in optimizing NLP tasks, providing insights for future advancements in the field.
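To make the contrast described above concrete, the sketch below is an illustrative example only, not the authors' implementation; the vocabulary size, embedding dimension, layer counts, and pooling strategy are assumptions. It shows how a passage of token IDs can be represented either as one-hot vectors or as dense, trainable embeddings, and how a small Transformer encoder with a linear head can map the embedded passage to a single complexity score.

```python
# Minimal sketch (assumed hyperparameters, not the paper's model): one-hot
# encoding vs. a learned embedding as input to a Transformer regression model.
import torch
import torch.nn as nn

vocab_size, embed_dim = 10_000, 128

# One-hot encoding: each token becomes a sparse vector of length vocab_size.
token_ids = torch.tensor([[4, 17, 256, 3]])                      # toy 4-token excerpt
one_hot = nn.functional.one_hot(token_ids, vocab_size).float()   # shape (1, 4, 10000)

# Embedding: each token maps to a dense, trainable vector of length embed_dim.
embedding = nn.Embedding(vocab_size, embed_dim)
dense = embedding(token_ids)                                     # shape (1, 4, 128)

# Small Transformer encoder plus a linear regression head that predicts
# one complexity score per passage (mean-pooled over token positions).
encoder_layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=4, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
regressor = nn.Linear(embed_dim, 1)

hidden = encoder(dense)                                          # (1, 4, 128)
score = regressor(hidden.mean(dim=1))                            # (1, 1) complexity score
```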

  Keywords

Natural language processing, Transformer models, regression models, word embeddings.