Fine-tuning a general transformer model on story-lines of IMDB movies database
dc.contributor.author | Ghasemi, Hojat | |
dc.date.accessioned | 2023-01-25T16:13:56Z | |
dc.date.available | 2023-01-25T16:13:56Z | |
dc.date.issued | 2022-01-13 | |
dc.description.abstract | Recent transformer-based language models pre-trained on huge text corpora have shown great success in performing downstream Natural Language Processing (NLP) tasks such as text summarization when fine-tuned on smaller labelled datasets. However, the impact of fine-tuning on the performance of pre-trained language models in summarizing movie storylines has not been explored. Moreover, there is a lack of extensive labelled datasets containing movie storylines that would allow pre-trained language models to delve deeper into this realm. In this research work, we propose a novel labelled dataset containing IMDB movie storylines alongside their summaries for teaching pre-trained language models how to perform text summarization on movie storylines. Furthermore, we showcase the potential of this dataset by fine-tuning a T5-base model on it. Our results show that fine-tuning a T5-base model on this dataset can significantly improve its performance in summarizing movie storylines. | en_US |
dc.description.degree | Master of Science (M.Sc.) in Computational Sciences | en_US |
dc.identifier.uri | https://laurentian.scholaris.ca/handle/10219/3980 | |
dc.language.iso | en | en_US |
dc.publisher.grantor | Laurentian University of Sudbury | en_US |
dc.subject | Fine-tuning | en_US |
dc.subject | language model | en_US |
dc.subject | text summarization | en_US |
dc.subject | transfer learning | en_US |
dc.subject | transformer | en_US |
dc.subject | pre-training | en_US |
dc.subject | abstractive summarization | en_US |
dc.subject | IMDB | en_US |
dc.subject | movie storyline | en_US |
dc.subject | natural language processing | en_US |
dc.subject | attention based models | en_US |
dc.subject | deep learning | en_US |
dc.title | Fine-tuning a general transformer model on story-lines of IMDB movies database | en_US |
dc.type | Thesis | en_US |
Files
Original bundle
- Name: Thesis FINAL_Hojat Ghasemi_ 16-Feb-2022.pdf
- Size: 1.79 MB
- Format: Adobe Portable Document Format
License bundle
- Name: license.txt
- Size: 6.52 KB
- Format: Item-specific license agreed upon to submission