Roberta Goes for IPO: Prospectus Analysis with Language Models for Indian Initial Public Offerings

doi:10.5121/csit.2022.121905

Volume 12, Number 19, November 2022

Authors

Abhishek Mishra¹ and Yogendra Sisodia², ¹Trust Group, India, ²Conga, India

Abstract

With the advent of large-scale language models in natural language processing (NLP), extracting valuable information from financial documents has gained popularity among researchers, and deep learning has boosted the development of effective text mining models. Prospectus text mining helps the investor community identify major risk factors and evaluate how the funds to be raised in an IPO will be used. In this paper, we investigate how the pre-trained language model RoBERTa can be adapted for this task. We also introduce prospectus-specific sentence transformers for semantic textual similarity, along with a dataset to verify the efficacy of our work.

Keywords

IPO, Prospectus, Large Language Models, Semantic Textual Similarity.

All Rights Reserved ® AIRCC