Multi-Task Learning in Natural Language Processing: An Overview

Online AM: 11 May 2024

Abstract

Deep learning approaches have achieved great success in the field of Natural Language Processing (NLP). However, directly training deep neural models often suffers from the overfitting and data scarcity problems that are pervasive in NLP tasks. In recent years, Multi-Task Learning (MTL), which leverages useful information from related tasks to improve performance on all of them simultaneously, has been used to address these problems. In this paper, we give an overview of the use of MTL in NLP. We first review MTL architectures used in NLP tasks and categorize them into four classes: parallel, hierarchical, modular, and generative adversarial architectures. We then present optimization techniques for loss construction, gradient regularization, data sampling, and task scheduling that allow a multi-task model to be trained properly. After presenting applications of MTL in a variety of NLP tasks, we introduce several benchmark datasets. Finally, we conclude and discuss possible research directions in this field.
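
To make the taxonomy above concrete, the sketch below implements the most common member of the parallel class, hard parameter sharing: a single shared encoder feeds one lightweight head per task, and training minimizes a weighted sum of per-task losses, the simplest of the loss-construction schemes the survey covers. This is a minimal illustrative sketch in PyTorch, not the paper's code; the model sizes, task names ("sentiment", "topic"), and loss weights are hypothetical choices.

```python
import torch
import torch.nn as nn

class ParallelMultiTaskModel(nn.Module):
    """Parallel (hard parameter sharing) MTL: shared encoder, per-task heads."""
    def __init__(self, vocab_size, hidden_dim, task_output_dims):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_dim)
        # Shared across all tasks: every task reads the same representation.
        self.encoder = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        # One lightweight classification head per task.
        self.heads = nn.ModuleDict({
            task: nn.Linear(hidden_dim, out_dim)
            for task, out_dim in task_output_dims.items()
        })

    def forward(self, token_ids, task):
        embedded = self.embedding(token_ids)
        _, (hidden, _) = self.encoder(embedded)
        return self.heads[task](hidden[-1])  # logits for the requested task

# Loss construction as a fixed weighted sum of per-task losses; the task
# names, weights, and random batch below are hypothetical examples.
model = ParallelMultiTaskModel(vocab_size=10_000, hidden_dim=128,
                               task_output_dims={"sentiment": 2, "topic": 5})
loss_fn = nn.CrossEntropyLoss()
task_weights = {"sentiment": 1.0, "topic": 0.5}
batch = {
    "sentiment": (torch.randint(0, 10_000, (4, 20)), torch.randint(0, 2, (4,))),
    "topic": (torch.randint(0, 10_000, (4, 20)), torch.randint(0, 5, (4,))),
}
total_loss = sum(task_weights[t] * loss_fn(model(x, t), y)
                 for t, (x, y) in batch.items())
total_loss.backward()  # gradients flow into shared and task-specific parameters
```

More sophisticated loss-construction and gradient-regularization techniques reviewed in the survey (e.g., weighting losses by learned task uncertainty, or resolving conflicting per-task gradients) replace the fixed weights and the plain summed backward pass shown here.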


Published in

ACM Computing Surveys (Just Accepted)
ISSN: 0360-0300 | EISSN: 1557-7341

Copyright © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.


        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Online AM: 11 May 2024
        • Accepted: 18 April 2024
        • Revised: 12 January 2024
        • Received: 25 September 2021
