Multi-Task Learning in Natural Language Processing: An Overview

Online AM: 11 May 2024

Abstract

Deep learning approaches have achieved great success in the field of Natural Language Processing (NLP). However, directly training deep neural models often suffers from the overfitting and data scarcity problems that are pervasive in NLP tasks. In recent years, Multi-Task Learning (MTL), which leverages useful information from related tasks to improve performance on all of them simultaneously, has been used to address these problems. In this paper, we give an overview of the use of MTL in NLP. We first review MTL architectures used in NLP tasks and categorize them into four classes: parallel, hierarchical, modular, and generative adversarial architectures. We then present optimization techniques for loss construction, gradient regularization, data sampling, and task scheduling that allow a multi-task model to be trained properly. After presenting applications of MTL in a variety of NLP tasks, we introduce several benchmark datasets. Finally, we conclude and discuss possible research directions in this field.
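
To make the taxonomy above concrete, the sketch below implements the most common member of the parallel class, hard parameter sharing: a single shared encoder feeds one lightweight head per task, and training minimizes a weighted sum of per-task losses, the simplest of the loss-construction schemes the survey covers. This is a minimal illustrative sketch in PyTorch, not the paper's code; the model sizes, task names ("sentiment", "topic"), and loss weights are hypothetical choices.

```python
import torch
import torch.nn as nn

class ParallelMultiTaskModel(nn.Module):
    """Parallel (hard parameter sharing) MTL: shared encoder, per-task heads."""
    def __init__(self, vocab_size, hidden_dim, task_output_dims):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_dim)
        # Shared across all tasks: every task reads the same representation.
        self.encoder = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        # One lightweight classification head per task.
        self.heads = nn.ModuleDict({
            task: nn.Linear(hidden_dim, out_dim)
            for task, out_dim in task_output_dims.items()
        })

    def forward(self, token_ids, task):
        embedded = self.embedding(token_ids)
        _, (hidden, _) = self.encoder(embedded)
        return self.heads[task](hidden[-1])  # logits for the requested task

# Loss construction as a fixed weighted sum of per-task losses; the task
# names, weights, and random batch below are hypothetical examples.
model = ParallelMultiTaskModel(vocab_size=10_000, hidden_dim=128,
                               task_output_dims={"sentiment": 2, "topic": 5})
loss_fn = nn.CrossEntropyLoss()
task_weights = {"sentiment": 1.0, "topic": 0.5}
batch = {
    "sentiment": (torch.randint(0, 10_000, (4, 20)), torch.randint(0, 2, (4,))),
    "topic": (torch.randint(0, 10_000, (4, 20)), torch.randint(0, 5, (4,))),
}
total_loss = sum(task_weights[t] * loss_fn(model(x, t), y)
                 for t, (x, y) in batch.items())
total_loss.backward()  # gradients flow into shared and task-specific parameters
```

More sophisticated loss-construction and gradient-regularization techniques reviewed in the survey (e.g., weighting losses by learned task uncertainty, or resolving conflicting per-task gradients) replace the fixed weights and the plain summed backward pass shown here.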


Published in

ACM Computing Surveys (Just Accepted)
ISSN: 0360-0300 | EISSN: 1557-7341

Copyright © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.


        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Online AM: 11 May 2024
        • Accepted: 18 April 2024
        • Revised: 12 January 2024
        • Received: 25 September 2021
