Text Matching: [NAACL 2022] GPL
2022-06-30 14:44:00 【用户1621453】
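Paper: https://arxiv.org/abs/2112.07577
A major drawback of the adaptive pre-training described in "Text Matching: [EMNLP 2021] TSDAE" is its high computational cost: you must first run pre-training on the corpus and then perform supervised learning on a labeled training dataset, and that labeled dataset can be very large.
GPL (Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval) overcomes this problem: it can be applied on top of an already fine-tuned model. You can therefore take one of the pre-trained models and adapt it to a specific domain.
The longer you train, the better the model becomes. Training a model takes roughly 1 day on a V100 GPU. GPL can also be combined with adaptive pre-training, which improves performance further.
GPL works in three stages: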
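- Query generation: for a given text from our domain, we first use a T5 model to generate possible queries for that text. For example, when your text is "Python is a high-level general-purpose programming language", the model might generate a query such as "What is Python". A pre-trained Chinese T5 doc2query model is available at: https://huggingface.co/doc2query/msmarco-chinese-mt5-base-v1
- Negative mining: next, for the generated query "What is Python", we mine negative passages from the corpus, i.e. passages that are similar to the query but that a user would not consider relevant. Such a negative passage might be "Java is a high-level, class-based, object-oriented programming language." We use dense retrieval for this mining: we take one of the existing text embedding models and retrieve passages relevant to the given query.
- Pseudo labeling: the negative mining step may retrieve passages that are actually relevant to the query (for example, another definition for "What is Python"). To overcome this issue, we use a Cross-Encoder to score all (query, passage) pairs.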
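The negative mining stage can be illustrated with a minimal sketch. It assumes hand-written 3-dimensional vectors in place of a real text embedding model, and the helper names `cosine` and `mine_negatives` are hypothetical rather than taken from the GPL code base. It shows how dense retrieval can return a passage that is in fact relevant (here `cpython`), which is why the pseudo labeling stage is needed.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mine_negatives(query_vec, passage_vecs, positive_id, top_k=2):
    """Return the ids of the top_k passages most similar to the query,
    skipping the passage the query was generated from."""
    scored = [
        (cosine(query_vec, vec), pid)
        for pid, vec in passage_vecs.items()
        if pid != positive_id
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [pid for _, pid in scored[:top_k]]


# Toy embeddings (assumed values, not produced by any real model).
passage_vecs = {
    "python": [0.9, 0.1, 0.0],     # positive: the text the query came from
    "java": [0.8, 0.3, 0.1],       # similar topic, but not relevant
    "cpython": [0.85, 0.05, 0.1],  # another definition of Python
    "cooking": [0.0, 0.1, 0.9],    # unrelated
}
query_vec = [1.0, 0.0, 0.0]  # stands in for the query "What is Python"

negatives = mine_negatives(query_vec, passage_vecs, positive_id="python")
print(negatives)
```

In this toy setup the mined list contains both a true negative (`java`) and a passage that is really relevant (`cpython`); an unrelated passage such as `cooking` is never retrieved.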
Training: once we have the triplets (generated query, positive passage, mined negative passage) and the Cross-Encoder scores for the pairs (query, positive) and (query, negative), we can start training the text embedding model with MarginMSELoss.
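The arithmetic behind a margin-MSE objective can be sketched in plain Python. This is an illustration under stated assumptions, not the library implementation: the function name `margin_mse_loss` and all scores are made up, the "student" is the text embedding model being trained, and the "teacher" is the Cross-Encoder. The loss penalizes the squared difference between the student's score margin and the teacher's score margin for each triplet.

```python
def margin_mse_loss(student_pos, student_neg, teacher_pos, teacher_neg):
    """Mean squared error between student margins and teacher margins.

    Each argument is a list with one score per triplet:
    *_pos is the score for (query, positive passage),
    *_neg is the score for (query, mined negative passage).
    """
    n = len(student_pos)
    if not (n == len(student_neg) == len(teacher_pos) == len(teacher_neg)):
        raise ValueError("all score lists must have the same length")
    total = 0.0
    for sp, sn, tp, tn in zip(student_pos, student_neg, teacher_pos, teacher_neg):
        student_margin = sp - sn  # how much the student prefers the positive
        teacher_margin = tp - tn  # pseudo label derived from Cross-Encoder scores
        total += (student_margin - teacher_margin) ** 2
    return total / n


# Two toy triplets with assumed scores.
loss = margin_mse_loss(
    student_pos=[10.0, 8.0],
    student_neg=[4.0, 7.0],
    teacher_pos=[9.0, 9.0],
    teacher_neg=[2.0, 8.5],
)
print(loss)
```

Because only the margins are compared, the student and the teacher do not need to produce scores on the same absolute scale.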
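The pseudo labeling step is essential: it improves performance over the earlier method QGen (see "Text Matching: [NeurIPS 2021] BEIR"), which treats every passage as either positive (1) or negative (0). As the figure below shows, for the generated query "what is futures contract", the negative mining step retrieves passages that are partially or even highly relevant to the generated query. With MarginMSELoss and the Cross-Encoder, we can identify these passages and teach the text embedding model that they are also relevant to the given query.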
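The difference between the two labeling schemes can be made concrete with a small sketch. The Cross-Encoder scores below are invented for illustration, and `binary_labels` and `margin_labels` are hypothetical helper names. Binary labels give every mined passage the same target, whereas margin labels give a small margin to a passage the Cross-Encoder rates almost as highly as the positive.

```python
def binary_labels(mined_ids):
    """QGen-style hard labels: every mined passage is a negative (0)."""
    return {pid: 0 for pid in mined_ids}


def margin_labels(positive_score, mined_scores):
    """GPL-style soft labels: Cross-Encoder margin between the positive
    passage and each mined passage."""
    return {pid: positive_score - score for pid, score in mined_scores.items()}


# Assumed Cross-Encoder scores for one generated query.
positive_score = 9.0
mined_scores = {
    "highly_relevant": 8.5,     # effectively a false negative
    "partially_relevant": 5.0,
    "irrelevant": -3.0,
}

hard = binary_labels(mined_scores)
soft = margin_labels(positive_score, mined_scores)
print(hard)
print(soft)
```

With the hard labels all three mined passages are pushed away from the query equally; with the soft labels the highly relevant passage only needs to score slightly below the positive.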
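The table below gives an overview of how GPL compares with adaptive pre-training (MLM and TSDAE). As mentioned above, GPL can be combined with adaptive pre-training: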