Fine-tuning a general transformer model on story-lines of IMDB movies database

dc.contributor.author: Ghasemi, Hojat
dc.date.accessioned: 2023-01-25T16:13:56Z
dc.date.available: 2023-01-25T16:13:56Z
dc.date.issued: 2022-01-13
dc.description.abstract: Recent transformer-based language models pre-trained on huge text corpora have shown great success in downstream Natural Language Processing (NLP) tasks such as text summarization when fine-tuned on smaller labeled datasets. However, the impact of fine-tuning on the performance of pre-trained language models in summarizing movie storylines has not been explored. Moreover, there is a lack of extensive labeled datasets of movie storylines that would allow pre-trained language models to delve deeper into this realm. In this research work, we propose a novel labeled dataset containing IMDB movie storylines alongside their summaries for teaching pre-trained language models how to perform text summarization on movie storylines. Furthermore, we showcase the potential of this dataset by fine-tuning a T5-base model on it. Our results show that fine-tuning a T5-base model on this dataset can significantly improve its performance in summarizing movie storylines. [en_US]
dc.description.degree: Master of Science (M.Sc.) in Computational Sciences [en_US]
dc.identifier.uri: https://laurentian.scholaris.ca/handle/10219/3980
dc.language.iso: en [en_US]
dc.publisher.grantor: Laurentian University of Sudbury [en_US]
dc.subject: Fine-tuning [en_US]
dc.subject: language model [en_US]
dc.subject: text summarization [en_US]
dc.subject: transfer learning [en_US]
dc.subject: transformer [en_US]
dc.subject: pre-training [en_US]
dc.subject: abstractive summarization [en_US]
dc.subject: IMDB [en_US]
dc.subject: movie storyline [en_US]
dc.subject: natural language processing [en_US]
dc.subject: attention based models [en_US]
dc.subject: deep learning [en_US]
dc.title: Fine-tuning a general transformer model on story-lines of IMDB movies database [en_US]
dc.type: Thesis [en_US]

Files

Original bundle

Name: Thesis FINAL_Hojat Ghasemi_ 16-Feb-2022.pdf
Size: 1.79 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.52 KB
Format: Item-specific license agreed upon to submission