[AI4Code] "IntelliCode Compose: Code Generation using Transformer", ESEC/FSE 2020
2022-07-25 11:11:00 【chad_lee】
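The model generates a whole line of code rather than a single token. It is based on GPT-2. The dataset is 1.2 billion lines of code in Python, C#, JavaScript, and TypeScript.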
Byte-Pair Encoding (BPE)
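Tokenization of the sequences does two things: it uses subtokens to shrink the vocabulary, and it masks string literals to prevent sensitive data from leaking.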
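The string-masking step can be illustrated with a minimal sketch. It assumes a simple regex over single-line quoted literals and a placeholder named `"<STR_LIT>"`; both are hypothetical choices for illustration and not the paper's actual preprocessing, which is not described in these notes.

```python
import re

# Hypothetical placeholder; the paper's actual masking token is not given here.
STR_PLACEHOLDER = '"<STR_LIT>"'

# Simplified pattern: single-line '...' or "..." literals, with backslash escapes.
# It does not handle triple-quoted strings or quotes inside comments.
_STRING_RE = re.compile(r'"(?:\\.|[^"\\\n])*"|\'(?:\\.|[^\'\\\n])*\'')


def mask_string_literals(source: str) -> str:
    """Replace every matched string literal with a fixed placeholder."""
    return _STRING_RE.sub(STR_PLACEHOLDER, source)


if __name__ == "__main__":
    print(mask_string_literals('password = "hunter2"'))
```

After masking, a literal such as a password or a connection string never reaches the training data, so the model cannot reproduce it at inference time.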
IntelliCode Compose
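The model is GPT. At inference time, sequence decoding is treated as a tree search that continues until a terminating token appears.
The tree is expanded with beam search of beam width K. If the final generated sequence has length L, the model has to make K*L predictions in total, but because the K candidates at each step can be run as one batch, only L batched forward passes are needed.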
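The call count can be checked with a small beam search sketch. The `toy_model`, the `<EOL>` token name, and the scoring by summed log-probabilities are assumptions made for illustration; this is a generic beam search, not the paper's exact decoder.

```python
import math
from typing import Callable, Dict, List, Tuple

EOL = "<EOL>"  # hypothetical terminating token name

Prefix = Tuple[str, ...]
# A batched model: takes a list of prefixes, returns one {token: log_prob} per prefix.
BatchModel = Callable[[List[Prefix]], List[Dict[str, float]]]


def beam_search(model: BatchModel, k: int, max_len: int) -> Tuple[Prefix, float, int]:
    """Return (best sequence, its log-prob, number of batched model calls)."""
    beams: List[Tuple[Prefix, float]] = [((), 0.0)]
    finished: List[Tuple[Prefix, float]] = []
    calls = 0
    for _ in range(max_len):
        if not beams:
            break
        # One batched call scores all live beams at once.
        batch_out = model([prefix for prefix, _ in beams])
        calls += 1
        candidates = [
            (prefix + (tok,), score + lp)
            for (prefix, score), log_probs in zip(beams, batch_out)
            for tok, lp in log_probs.items()
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for prefix, score in candidates[:k]:  # keep only the top-k expansions
            (finished if prefix[-1] == EOL else beams).append((prefix, score))
    best = max(finished + beams, key=lambda c: c[1])
    return best[0], best[1], calls


def toy_model(prefixes: List[Prefix]) -> List[Dict[str, float]]:
    """Toy distribution: two free tokens, then a forced end-of-line token."""
    return [
        {"a": math.log(0.6), "b": math.log(0.4)} if len(p) < 2 else {EOL: 0.0}
        for p in prefixes
    ]


if __name__ == "__main__":
    seq, score, calls = beam_search(toy_model, k=2, max_len=10)
    print(seq, calls)
```

In this toy run the best sequence has three tokens and the model is called three times, once per position, even though up to K prefixes are scored at each step.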
Multilingual model
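Four ways of modeling multiple languages are compared:
1) Ignore the differences between languages and train one unified model on all of them [experiments show this performs worse than training a separate model for each language].
2) Add language type embedding information: each language is represented by a vector, which is combined with the original token embedding and the other embeddings.
3) Prepend to every training sample a sentence of the form "lang * remaining token sequence", where $lang \in \{Python, C\#, JavaScript, TypeScript\}$.
4) Add a language type classification task during pretraining, i.e. one extra head that predicts the language of each sample.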
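Option 3 can be sketched as a small preprocessing helper. The function name `add_language_prefix` is hypothetical, and reading `*` as a literal separator token placed between the language name and the code tokens is an assumption about the quoted format, not something these notes confirm.

```python
from typing import List

LANGUAGES = ("Python", "C#", "JavaScript", "TypeScript")


def add_language_prefix(lang: str, tokens: List[str]) -> List[str]:
    """Prepend a language control prefix to one tokenized training sample.

    Assumption: "*" is treated as a literal separator token.
    """
    if lang not in LANGUAGES:
        raise ValueError(f"unsupported language: {lang}")
    return [lang, "*"] + list(tokens)


if __name__ == "__main__":
    print(add_language_prefix("Python", ["def", "f", "(", ")", ":"]))
```

With this scheme the model architecture stays unchanged; the language signal is carried entirely by the first tokens of each sample.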