Abstractive summarization for the Ukrainian language: Multi-task learning with hromadske.ua news dataset

Reference

Galeshchuk S. (2023). Abstractive summarization for the Ukrainian language: Multi-task learning with hromadske.ua news dataset. Proceedings of the Second Ukrainian Natural Language Processing Workshop (UNLP), pp. 49-53.

Agency Expert(s) related to the Article

Dr. Svitlana Galeshchuk

Abstract

Despite recent NLP developments, abstractive summarization remains a challenging task, especially for low-resource languages such as Ukrainian. The paper aims to improve the quality of summaries produced by mT5 for Ukrainian news by fine-tuning the model on a mixture of summarization and text similarity tasks, using summary-article and title-article training pairs, respectively. The proposed training set-up with small, base, and large mT5 models produces higher-quality summaries. In addition, we present a new Ukrainian dataset for the abstractive summarization task that consists of circa 36.5K articles collected from Hromadske.ua until June 2021.
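
To make the multi-task set-up in the abstract more concrete, the following is a minimal sketch of how summary-article and title-article pairs could be combined into one text-to-text training mixture. It is an illustration under stated assumptions, not the paper's implementation: the function name `build_mixture`, the record fields (`text`, `summary`, `title`), the task prefixes, and the choice to cast the similarity task as producing the title from the article are all hypothetical, since the abstract does not specify them.

```python
import random


def build_mixture(articles, summary_prefix="summarize: ",
                  similarity_prefix="similarity: ", seed=0):
    """Build a shuffled list of text-to-text examples from article records.

    Each record yields two examples: one summary-article pair and one
    title-article pair. Prefixes and field names are assumptions made
    for this sketch only.
    """
    examples = []
    for record in articles:
        # Summarization task: article text in, summary out.
        examples.append({
            "task": "summarization",
            "input": summary_prefix + record["text"],
            "target": record["summary"],
        })
        # Second task built from title-article pairs (assumed formulation).
        examples.append({
            "task": "similarity",
            "input": similarity_prefix + record["text"],
            "target": record["title"],
        })
    # A seeded shuffle interleaves the two tasks reproducibly.
    random.Random(seed).shuffle(examples)
    return examples


# Toy placeholder records, not taken from the Hromadske.ua dataset.
articles = [
    {"title": "Title A", "summary": "Summary A", "text": "Full text of article A."},
    {"title": "Title B", "summary": "Summary B", "text": "Full text of article B."},
]
mixture = build_mixture(articles)
for example in mixture:
    print(example["task"], "|", example["input"], "->", example["target"])
```

A list like this could then be tokenized and passed to any sequence-to-sequence fine-tuning loop for mT5; the mixing ratio between the two tasks is fixed at one-to-one here only for simplicity.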