Volume 10, Number 2

Extractive Summarization with Very Deep Pretrained Language Model


Yang Gu¹ and Yanke Hu², ¹Suning USA, USA and ²Humana, USA


Recent developments in generative pretrained language models have proven very successful on a wide range of NLP tasks, such as text classification, question answering, and textual entailment. In this work, we present a two-phase encoder-decoder architecture based on Bidirectional Encoder Representations from Transformers (BERT) for the extractive summarization task. We evaluated our model with both automatic metrics and human annotators, and demonstrated that the architecture achieves results comparable to the state of the art on a large-scale corpus, CNN/Daily Mail. To the best of our knowledge, this is the first work that applies a BERT-based architecture to a text summarization task and achieves results comparable to the state of the art.


BERT, AI, Deep Learning, Summarization