Improving the Alipay Search Experience: Ant Group and Renmin University of China Propose a Hierarchical Contrastive Learning Framework for Text Generation
Experimental Results
We conducted experiments on three public datasets: Douban (dialogue) [9], QQP (paraphrasing) [10][11], and RocStories (storytelling) [12], and achieved SOTA results on all of them. The baselines include traditional generation models (e.g., CVAE [13], Seq2Seq [14], Transformer [15]), methods built on pre-trained models (e.g., Seq2Seq-DU [16], DialoGPT [17], BERT-GEN [7], T5 [18]), and contrastive-learning-based methods (e.g., Group-wise [9], T5-CLAPS [19]). As automatic evaluation metrics we report BLEU score [20] and the BOW embedding similarity between sentence pairs (extrema/average/greedy) [21]; the results are shown in the figure below. On the QQP dataset we also ran a human evaluation: three annotators independently rated the outputs of T5-CLAPS, DialoGPT, Seq2Seq-DU, and our model, with the results shown in the figure below.
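For readers unfamiliar with the BOW embedding metrics [21], here is a minimal NumPy sketch of the three pooling variants. It assumes each sentence is given as a matrix of word embeddings of shape (num_words, dim); the function names and the epsilon guard are ours, not from the paper.

```python
import numpy as np

def _cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def bow_average(hyp: np.ndarray, ref: np.ndarray) -> float:
    """Cosine similarity between mean-pooled word embeddings."""
    return _cosine(hyp.mean(axis=0), ref.mean(axis=0))

def bow_extrema(hyp: np.ndarray, ref: np.ndarray) -> float:
    """Cosine similarity between extrema-pooled embeddings:
    per dimension, keep the value with the largest magnitude."""
    def extrema(m):
        mx, mn = m.max(axis=0), m.min(axis=0)
        return np.where(np.abs(mx) >= np.abs(mn), mx, mn)
    return _cosine(extrema(hyp), extrema(ref))

def bow_greedy(hyp: np.ndarray, ref: np.ndarray) -> float:
    """Greedy matching: average the best cosine match of each word,
    symmetrized over both directions."""
    def unit(m):
        return m / (np.linalg.norm(m, axis=1, keepdims=True) + 1e-8)
    sims = unit(hyp) @ unit(ref).T  # pairwise cosine matrix
    return float((sims.max(axis=1).mean() + sims.max(axis=0).mean()) / 2)
```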
Ablation Study

We ran ablation experiments on whether to introduce keywords, whether to introduce the keyword graph, and whether to introduce the Mahalanobis-distance-based contrastive distribution. The results show that all three design choices contribute materially to the final performance; the experimental results are shown in the figure below.
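As a point of reference for the last ablation, below is a minimal NumPy sketch of the Mahalanobis distance [8] between a representation and a Gaussian fitted on a group of sample representations. How exactly the distance enters the contrastive objective is not reproduced here, and all variable names are illustrative.

```python
import numpy as np

def mahalanobis_distance(x: np.ndarray, mu: np.ndarray, cov: np.ndarray) -> float:
    """Mahalanobis distance between vector x and a distribution
    with mean mu and covariance cov [8]."""
    diff = x - mu
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

# Toy usage: distance of a candidate representation to the Gaussian
# fitted on a set of (hypothetical) positive-sample representations.
pos = np.random.randn(100, 8)
d = mahalanobis_distance(np.random.randn(8), pos.mean(axis=0),
                         np.cov(pos, rowvar=False))
```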
Visualization Analysis

To study the effect of contrastive learning at different granularities, we visualized randomly sampled cases after dimensionality reduction with t-SNE [22], as shown in the figure below. The figure shows that the representation of an input sentence lies very close to the representation of its extracted keywords, which indicates that keywords, as the most important information carriers in a sentence, largely determine where the semantic distribution sits. Moreover, we can see that after contrastive training the distribution of the input sentence moves closer to the positive samples and farther away from the negative samples, which shows that contrastive learning helps calibrate the semantic distribution.
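Here is a minimal sketch of how such a plot can be produced with scikit-learn's t-SNE. The group names mirror the figure, but the representations below are random stand-ins rather than actual model hidden states.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Hypothetical representations: in the real setting these would be the
# model's hidden states for input sentences, extracted keywords, and
# positive/negative contrastive samples.
groups = {name: rng.normal(loc=i, size=(50, 128))
          for i, name in enumerate(["input", "keyword", "positive", "negative"])}

reps = np.vstack(list(groups.values()))
coords = TSNE(n_components=2, perplexity=30,
              init="pca", random_state=0).fit_transform(reps)

for i, name in enumerate(groups):
    chunk = coords[i * 50:(i + 1) * 50]
    plt.scatter(chunk[:, 0], chunk[:, 1], s=8, label=name)
plt.legend()
plt.savefig("tsne_cases.png", dpi=150)
```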
Keyword Importance Analysis

Finally, we explore the impact of sampling different keywords. As shown in the table below, for a given input we obtain keywords either by TextRank extraction or by random selection, use them as the condition controlling the semantic distribution, and inspect the quality of the generated text (a minimal TextRank sketch is given at the end of this section). Since keywords are the most important information carriers in a sentence, different keywords induce different semantic distributions and hence different generated responses, and the more keywords are selected, the more faithful the generated sentence tends to be. Results generated by other models are also shown in the table.

Business Application

In this paper we proposed a hierarchical contrastive learning mechanism that unifies hybrid granularities and outperforms competitive baselines on several text generation datasets. A query rewriting model based on this work has been successfully deployed in Alipay's production search scenario and has delivered clear gains. Services in Alipay search span a wide range of industries with strong domain-specific wording, so the way users phrase a search query often differs at the surface level from the way services are described, and direct keyword matching therefore struggles to perform well (for example, the user query “新主板汽车查阅” fails to recall the service “新车主板查阅”). The goal of query rewriting is to rewrite the user's input query, while keeping its intent essentially unchanged, into a form closer to how services are described, so that the target services can be matched more accurately. Some rewriting examples are shown below:
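As promised above, here is a minimal TextRank [6] sketch, built on networkx PageRank over a sliding-window co-occurrence graph. It omits the part-of-speech filtering of the original algorithm, and the example sentence is ours.

```python
import networkx as nx

def textrank_keywords(tokens, window=4, topk=5):
    """Minimal TextRank [6]: build a co-occurrence graph over tokens
    within a sliding window and rank nodes with PageRank."""
    graph = nx.Graph()
    for i, tok in enumerate(tokens):
        for other in tokens[i + 1:i + window]:
            if tok != other:
                graph.add_edge(tok, other)
    scores = nx.pagerank(graph)
    return sorted(scores, key=scores.get, reverse=True)[:topk]

print(textrank_keywords(
    "new car owner registration status online query service".split()))
```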
References

[1] Seanie Lee, Dong Bok Lee, and Sung Ju Hwang. 2021. Contrastive learning with adversarial perturbations for conditional text generation. In 9th International Conference on Learning Representations, ICLR.
[2] Hengyi Cai, Hongshen Chen, Yonghao Song, Zhuoye Ding, Yongjun Bao, Weipeng Yan, and Xiaofang Zhao. 2020. Group-wise contrastive learning for neural dialogue generation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings, EMNLP 2020.
[3] Jiwei Li, Minh-Thang Luong, and Dan Jurafsky. 2015. A hierarchical neural autoencoder for paragraphs and documents. arXiv preprint.
[4] Meng-Hsuan Yu, Juntao Li, Zhangming Chan, Dongyan Zhao, and Rui Yan. 2021. Content learning with structure-aware writing: A graph-infused dual conditional variational autoencoder for automatic storytelling. In Proceedings of the AAAI Conference on Artificial Intelligence.
[5] Solomon Kullback and Richard A Leibler. 1951. On information and sufficiency. The annals of mathematical statistics.
[6] Rada Mihalcea and Paul Tarau. 2004. Textrank: Bringing order into text. In Proceedings of the 2004 conference on empirical methods in natural language processing.
[7] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint.
[8] Roy De Maesschalck, Delphine Jouan-Rimbaud, and Désiré L Massart. 2000. The mahalanobis distance. Chemometrics and intelligent laboratory systems.
[9] Hengyi Cai, Hongshen Chen, Yonghao Song, Zhuoye Ding, Yongjun Bao, Weipeng Yan, and Xiaofang Zhao. 2020. Group-wise contrastive learning for neural dialogue generation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings, EMNLP 2020.
[10] Shankar Iyer, Nikhil Dandekar, and Kornel Csernai. 2017. First quora dataset release: Question pairs.
[11] Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel R. Bowman. 2019. GLUE: A multi-task benchmark and ysis platform for natural language understanding. In the Proceedings of ICLR.
[12] Nasrin Mostafazadeh, Nathanael Chambers, Xiaodong He, Devi Parikh, Dhruv Batra, Lucy Vanderwende, Pushmeet Kohli, and James Allen. 2016. A corpus and cloze evaluation for deeper understanding of commonsense stories. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.
[13] Tiancheng Zhao, Ran Zhao, and Maxine Eskenazi. 2017. Learning discourse-level diversity for neural dialog models using conditional variational autoencoders. arXiv preprint.
[14] Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information.
[15] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017.
[16] Yue Feng, Yang Wang, and Hang Li. 2021. A sequence-to-sequence approach to dialogue state tracking. ACL 2021.
[17] Yizhe Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu, and Bill Dolan. 2020. DialoGPT: Large-scale generative pre-training for conversational response generation. In ACL, system demonstration.
[18] Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J. Liu. 2020. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res.
[19] Seanie Lee, Dong Bok Lee, and Sung Ju Hwang. 2021. Contrastive learning with adversarial perturbations for conditional text generation. In 9th International Conference on Learning Representations, ICLR 2021.
[20] Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. Bleu: a method for automatic evaluation of machine translation. In ACL.
[21] Xiaodong Gu, Kyunghyun Cho, Jung-Woo Ha, and Sunghun Kim. 2019. DialogWAE: Multimodal response generation with conditional wasserstein autoencoder. In International Conference on Learning Representations.
[22] Laurens van der Maaten and Geoffrey Hinton. 2008. Visualizing data using t-SNE. Journal of Machine Learning Research.