Fine-tuning a general transformer model on storylines from the IMDB movie database
Date
2022-01-13
Abstract
Recent transformer-based language models pre-trained on huge text corpora have shown great
success in performing downstream Natural Language Processing (NLP) tasks, such as text
summarization, when fine-tuned on smaller labelled datasets. However, the impact of fine-tuning
on the performance of pre-trained language models in summarizing movie storylines
has not been explored. Moreover, there is a lack of extensive labelled datasets of movie
storylines that would allow pre-trained language models to delve deeper into this domain. In this
research work, we propose a novel labelled dataset containing IMDB movie storylines alongside
their summaries, for teaching pre-trained language models to perform text summarization on
movie storylines. Furthermore, we showcase the potential of this dataset by fine-tuning a T5-base
model on it. Our results show that fine-tuning a T5-base model on this dataset significantly
improves its performance in summarizing movie storylines.
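The abstract does not include code, but a minimal sketch of the described setup, fine-tuning T5-base for abstractive summarization with the Hugging Face Transformers and Datasets libraries, might look like the following. The file name imdb_storylines.csv and the column names storyline and summary are hypothetical placeholders; the released dataset's actual schema is not specified here.

```python
# Minimal sketch: fine-tune T5-base on storyline/summary pairs.
# Assumes a CSV with hypothetical columns "storyline" and "summary".
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainingArguments,
    Seq2SeqTrainer,
)

tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")

# Hypothetical file name; the published dataset's format may differ.
dataset = load_dataset("csv", data_files="imdb_storylines.csv", split="train")

def preprocess(batch):
    # T5 expects a task prefix for summarization.
    inputs = tokenizer(
        ["summarize: " + s for s in batch["storyline"]],
        max_length=512,
        truncation=True,
    )
    labels = tokenizer(text_target=batch["summary"], max_length=128, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="t5-base-imdb-storylines",
        num_train_epochs=3,
        per_device_train_batch_size=8,
    ),
    train_dataset=tokenized,
    # Pads inputs and labels dynamically per batch.
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

The "summarize:" prefix reflects T5's text-to-text convention of signalling the task in the input; hyperparameters such as sequence lengths, epochs, and batch size are illustrative defaults, not values reported in this work.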
Keywords
Fine-tuning, language model, text summarization, transfer learning, transformer, pre-training, abstractive summarization, IMDB, movie storyline, natural language processing, attention-based models, deep learning