Multi-Task Learning in Natural Language Processing: An Overview
ACM Computing Surveys (IF 16.6) Pub Date: 2024-05-11, DOI: 10.1145/3663363
Shijie Chen, Yu Zhang, Qiang Yang

Deep learning approaches have achieved great success in the field of Natural Language Processing (NLP). However, directly training deep neural models often suffers from the overfitting and data scarcity problems that are pervasive in NLP tasks. In recent years, Multi-Task Learning (MTL), which leverages useful information from related tasks to improve performance on all of them simultaneously, has been used to address these problems. In this paper, we give an overview of the use of MTL in NLP tasks. We first review MTL architectures used in NLP tasks and categorize them into four classes: parallel, hierarchical, modular, and generative adversarial architectures. We then present optimization techniques for loss construction, gradient regularization, data sampling, and task scheduling that are needed to properly train a multi-task model. After presenting applications of MTL in a variety of NLP tasks, we introduce some benchmark datasets. Finally, we conclude and discuss several possible research directions in this field.
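To make the parallel (hard-parameter-sharing) architecture and the weighted loss construction mentioned in the abstract concrete, here is a minimal PyTorch sketch. The two toy tasks (sequence tagging and sentence classification), the LSTM encoder, and the fixed 0.5/0.5 loss weights are illustrative assumptions, not the survey's reference implementation.

```python
import torch
import torch.nn as nn

class ParallelMTLModel(nn.Module):
    """Hard parameter sharing: one shared encoder, one head per task."""
    def __init__(self, vocab_size=10000, hidden=256, num_tags=10, num_classes=2):
        super().__init__()
        # Shared layers, updated by gradients from every task.
        self.embed = nn.Embedding(vocab_size, hidden)
        self.encoder = nn.LSTM(hidden, hidden, batch_first=True)
        # Task-specific heads (hypothetical tasks chosen for illustration).
        self.tagging_head = nn.Linear(hidden, num_tags)      # per-token labels
        self.classify_head = nn.Linear(hidden, num_classes)  # sentence label

    def forward(self, token_ids):
        states, _ = self.encoder(self.embed(token_ids))
        tag_logits = self.tagging_head(states)           # (batch, seq, num_tags)
        cls_logits = self.classify_head(states.mean(1))  # mean-pooled sentence repr.
        return tag_logits, cls_logits

model = ParallelMTLModel()
tag_loss_fn = nn.CrossEntropyLoss()
cls_loss_fn = nn.CrossEntropyLoss()

# Toy batch: 8 sentences of 20 tokens with random gold labels.
tokens = torch.randint(0, 10000, (8, 20))
tag_gold = torch.randint(0, 10, (8, 20))
cls_gold = torch.randint(0, 2, (8,))

tag_logits, cls_logits = model(tokens)
# Simplest loss construction: a fixed weighted sum of per-task losses.
loss = 0.5 * tag_loss_fn(tag_logits.reshape(-1, tag_logits.size(-1)),
                         tag_gold.reshape(-1)) \
     + 0.5 * cls_loss_fn(cls_logits, cls_gold)
loss.backward()  # gradients from both tasks flow into the shared encoder
```

In this simplest scheme the weights are fixed hyperparameters; the loss construction and task scheduling techniques surveyed in the paper adjust the per-task contributions dynamically during training, but the shared-encoder-plus-task-heads skeleton stays the same.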



Updated: 2024-05-11