Abstract
Sequence-to-sequence (seq2seq) models are widely used in abstractive text summarization. The decoder of the traditional model relies on an attention mechanism and treats the hidden state of each word as the complete semantic representation of the original text. However, the hidden state of a word captures only the semantics of the words surrounding it, so the semantic information of the original text is not fully represented. As a result, the generated summary may omit important information from the original text, which harms both the accuracy and the readability of the summary. To address this issue, this paper proposes TEA, a topic-information-based extractive and abstractive fusion model for summary generation. The model consists of two modules: a BERT-based extractive module and a seq2seq-based abstractive module. The extractive module performs sequential annotation at the sentence level, while the abstractive module uses a pointer-generator network to generate the summary. During generation, an attention mechanism based on topic information is applied: the TextRank algorithm selects N keywords, and the similarity between the keywords and the original text, computed by the attention function, serves as the weight of the topic encoding in the attention mechanism. Experimental results on a Chinese dataset show that, compared with state-of-the-art text summarization models, the proposed model improves the consistency between the generated summary and the original text and better captures its content. The ROUGE-1, ROUGE-2, and ROUGE-L scores increase by 2.07%, 3.94%, and 3.53%, respectively.
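The abstract's pipeline (TextRank keyword selection, topic-weighted attention, and a pointer-generator output distribution) can be sketched as follows. This is a minimal illustrative sketch, not the paper's exact formulation: the helper names (`textrank_keywords`, `topic_weighted_attention`, `pointer_generator_dist`), the dot-product scoring, the blending weight `beta`, and the omission of out-of-vocabulary extension in the pointer mixture are all simplifying assumptions.

```python
import numpy as np

def textrank_keywords(words, window=2, d=0.85, iters=50, top_n=3):
    """Tiny TextRank: PageRank over a word co-occurrence graph.
    Illustrative helper, not the paper's implementation."""
    vocab = sorted(set(words))
    idx = {w: i for i, w in enumerate(vocab)}
    n = len(vocab)
    adj = np.zeros((n, n))
    for i, w in enumerate(words):
        # Link each word to neighbors within the co-occurrence window.
        for j in range(max(0, i - window), min(len(words), i + window + 1)):
            if j != i:
                adj[idx[w], idx[words[j]]] = 1.0
    col = adj.sum(axis=0, keepdims=True)
    col[col == 0] = 1.0
    trans = adj / col                      # column-stochastic transition matrix
    score = np.full(n, 1.0 / n)
    for _ in range(iters):                 # power iteration with damping d
        score = (1 - d) / n + d * trans @ score
    order = np.argsort(-score)
    return [vocab[i] for i in order[:top_n]]

def topic_weighted_attention(dec_state, enc_states, topic_vec, beta=0.5):
    """Blend standard dot-product attention with each source position's
    similarity to a topic vector built from the keywords (assumed form)."""
    base = enc_states @ dec_state          # (T,) standard attention scores
    topic = enc_states @ topic_vec         # (T,) keyword/topic similarity
    scores = base + beta * topic
    e = np.exp(scores - scores.max())      # numerically stable softmax
    return e / e.sum()

def pointer_generator_dist(p_vocab, attn, src_ids, p_gen):
    """Pointer-generator final distribution (See et al., 2017): mix the
    vocabulary distribution with copy probabilities from attention.
    OOV handling is omitted for brevity."""
    final = p_gen * np.asarray(p_vocab, dtype=float)
    for t, wid in enumerate(src_ids):      # route attention mass to source tokens
        final[wid] += (1.0 - p_gen) * attn[t]
    return final
```

With a toy input, the attention weights and the final mixed distribution each sum to 1, and `textrank_keywords` returns the requested number of keywords; the topic term simply biases attention toward source positions that resemble the extracted keywords.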
References
Badrinath, R., Venkatasubramaniyan, S., & Madhavan, C. E. V. (2011). Improving query focused summarization using look-ahead strategy. In: P. D. Clough, C. Foley, C. Gurrin, G. J. F. Jones, W. Kraaij, H. Lee, & V Murdock (Eds.) Advances in Information Retrieval - 33rd European Conference on IR Research, ECIR 2011, Dublin, Ireland, April 18-21. Proceedings, (pp. 641–652)
Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural machine translation by jointly learning to align and translate. arXiv:1409.0473
Chopra, S., Auli, M., & Rush, A. M. (2016) Abstractive sentence summarization with attentive recurrent neural networks. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, (pp. 93–98)
Cui, Y., Che, W., Liu, T., Qin, B., Yang, Z., Wang, S., & Hu, G. (2019). Pre-training with whole word masking for Chinese BERT. arXiv:1906.08101
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv:1810.04805
Ding, J., Li, Y., Ni, H., & Yang, Z. (2020). Generative text summary based on enhanced semantic attention and gain-benefit gate. IEEE Access, 8, 92659–92668. https://doi.org/10.1109/ACCESS.2020.2994092
Gu, J., Lu, Z., Li, H., & Li, V. O. (2016). Incorporating copying mechanism in sequence-to-sequence learning. arXiv:1603.06393
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, June 27-30, 2016, (pp. 770–778)
Hermann, K. M., Kociský, T., Grefenstette, E., Espeholt, L., Kay, W., Suleyman, M., & Blunsom, P. (2015). Teaching machines to read and comprehend. In: C Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, & R. Garnett (Eds.) Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, December 7-12, 2015, Montreal, Quebec, Canada, (pp. 1693–1701). https://proceedings.neurips.cc/paper/2015/hash/afdec7005cc9f14302cd0474fd0f3c96-Abstract.html
LeCun, Y., & Bengio, Y. (1995). Convolutional networks for images, speech, and time series. The handbook of brain theory and neural networks, 3361(10), 1995.
Lin, J., Sun, X., Ma, S., & Su, Q. (2018). Global encoding for abstractive summarization. In: I. Gurevych, Y. Miyao (Eds.) Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia, July 15-20, 2018, Short Papers, (vol. 2, pp. 163–169)
Liu, Y. (2019). Fine-tune BERT for extractive summarization. arXiv:1903.10318
Luong, T., Pham, H., & Manning, C. D. (2015). Effective approaches to attention-based neural machine translation. In: L. Màrquez, C. Callison-Burch, J. Su, D. Pighin, & Y. Marton (Eds.) Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015, Lisbon, Portugal, September 17-21, 2015, (pp. 1412–1421)
Mihalcea, R. (2004). Graph-based ranking algorithms for sentence extraction, applied to text summarization. In: Proceedings of the ACL Interactive Poster and Demonstration Sessions, (pp. 170–173)
Nallapati, R., Zhai, F., & Zhou, B. (2017). Summarunner: A recurrent neural network based sequence model for extractive summarization of documents. In: Proceedings of the AAAI Conference on Artificial Intelligence, (vol. 31)
Nallapati, R., Zhou, B., dos Santos, C. N., Gülçehre, Ç., & Xiang, B. (2016). Abstractive text summarization using sequence-to-sequence rnns and beyond. In: Y. Goldberg, S. Riezler (Eds.) Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning, CoNLL 2016, Berlin, Germany, August 11-12, 2016, (pp. 280–290)
Page, L., Brin, S., Motwani, R., & Winograd, T. (1999). The pagerank citation ranking: Bringing order to the web. Technical Report 1999-66, Stanford InfoLab. Previous number = SIDL-WP-1999-0120. http://ilpubs.stanford.edu:8090/422/
Peng, D., Wang, Y., Liu, C., & Chen, Z. (2020). TL-NER: A transfer learning model for Chinese named entity recognition. Information Systems Frontiers, 22(6), 1291–1304. https://doi.org/10.1007/s10796-019-09932-y
Pennington, J., Socher, R., & Manning, C. D. (2014). Glove: Global vectors for word representation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), (pp. 1532–1543)
Peters, M. E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., & Zettlemoyer, L. (2018). Deep contextualized word representations. In: M. A. Walker, H. Ji, & A. Stent (Eds.) Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2018, New Orleans, Louisiana, USA, June 1-6, 2018 (Long Papers), (vol. 1, pp. 2227–2237)
Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving language understanding by generative pre-training.
Rush, A. M., Chopra, S., & Weston, J. (2015). A neural attention model for abstractive sentence summarization. In: L. Màrquez, C. Callison-Burch, J. Su, D. Pighin, & Y. Marton (Eds.) Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015, Lisbon, Portugal, September 17-21, 2015, (pp. 379–389)
See, A., Liu, P. J., & Manning, C. D. (2017). Get to the point: Summarization with pointer-generator networks. In: R. Barzilay, M. Kan (Eds.) Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30 - August 4, Long Papers, (vol. 1, pp. 1073–1083)
van den Oord, A., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., Kalchbrenner, N., Senior, A. W., & Kavukcuoglu, K. (2016). Wavenet: A generative model for raw audio. arXiv:1609.03499
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., & Polosukhin, I. (2017). Attention is all you need. In: I. Guyon, U. von Luxburg, S. Bengio, H. M. Wallach, Fergus, R., S. V. N. Vishwanathan, & R. Garnett (Eds.) Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, (pp. 5998–6008). https://proceedings.neurips.cc/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html
Vinyals, O., Fortunato, M., & Jaitly, N. (2015) Pointer networks. arXiv:1506.03134
Yasunaga, M., Zhang, R., Meelu, K., Pareek, A., Srinivasan, K., & Radev, D. R. (2017). Graph-based neural multi-document summarization. arXiv:1706.06681
Yu, F., & Koltun, V. (2016). Multi-scale context aggregation by dilated convolutions. In: Y. Bengio, Y. LeCun (Eds.) 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings. arXiv:1511.07122
Zhou, Q., Yang, N., Wei, F., & Zhou, M. (2017). Selective encoding for abstractive sentence summarization. In: R. Barzilay, M. Kan (Eds.) Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017, Vancouver, Canada, July 30 - August 4, Volume 1: Long Papers, (pp. 1095–1104)
Acknowledgements
This work is supported by the National Natural Science Foundation of China under Grant No. 61772342. We would like to express our special thanks to the members of our lab for their valuable discussion of this work.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflicts of interest
We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Peng, D., Yu, B. TEA: Topic Information based Extractive-Abstractive Fusion Model for Long Text Summary. Inf Syst Front (2023). https://doi.org/10.1007/s10796-023-10442-1
Accepted:
Published:
DOI: https://doi.org/10.1007/s10796-023-10442-1