AlphaCode is not a substitute for programmers, but a tool for developers
2022-07-02 10:15:00 【AI technology base camp】

Compiled by | Hemu wood
Produced by | AI Technology Base Camp (ID: rgznai100)
DeepMind, the AI research lab, has introduced a deep learning model that can generate software source code with remarkable results. The model, called AlphaCode, is based on Transformers, the same architecture OpenAI uses in its code generation models.
Programming is one of the promising applications of deep learning and large language models. The growing demand for programming talent has spurred a race to create tools that make developers more productive and that let non-developers create software.
In this respect, AlphaCode is certainly impressive. It has solved complex programming challenges that typically require hours of planning, coding, and testing. It may become a good tool for turning problem descriptions into working code.
But it is not the equivalent of a human programmer of any level. It is a completely different approach to creating software, one that is incomplete without human thinking and intuition.

Coding competition

An example of a coding challenge description. Image from DeepMind.
AlphaCode is not the only system of its kind, but it accomplishes a very complex task. Other similar systems focus on generating short snippets of code, such as functions or code blocks that perform small tasks (for example, setting up a web server or pulling information from an API). Impressive as they are, these tasks become trivial once a language model has been exposed to a large enough corpus of source code.
AlphaCode, on the other hand, is designed to solve competitive programming problems. Participants in a coding challenge must read the challenge description, understand the problem, turn it into an algorithmic solution, implement it in a general-purpose programming language, and evaluate it against a limited set of test cases. Finally, their results are scored on hidden tests that are not available during implementation. Coding challenges can also impose other conditions, such as time and memory limits.
In essence, a machine learning model taking part in a coding challenge must generate a complete program that solves a problem it has never seen before.

An example of a coding challenge solution. Image from DeepMind.

Transformers and the power of large language models
AlphaCode is another example of the progress large language models have made in solving complex problems. This kind of deep learning system is often called a sequence-to-sequence (Seq2seq) model. A Seq2seq algorithm takes a sequence of values (letters, pixels, numbers, etc.) as input and produces another sequence of values. This is the approach used in many natural language tasks, such as machine translation, text generation, and speech recognition.
According to DeepMind's paper, AlphaCode uses an encoder-decoder Transformer architecture. Transformers have become especially popular in recent years because they can handle long sequences of data with much less memory and compute than their predecessors, recurrent neural networks (RNN) and long short-term memory networks (LSTM).

The structure of the Transformer network
The encoder part of AlphaCode creates a numerical representation of the natural language description of the problem. The decoder part takes the embedding vectors produced by the encoder and tries to generate the source code of the solution.
As it turns out, Transformer models are good at such tasks, especially when they are given enough training data and compute. But in the researchers' view, what really sets AlphaCode apart is not just the brute force of feeding raw data into a very large neural network. It owes more to the ingenuity of DeepMind's scientists in designing the training process and the algorithms that generate and filter its results.

Unsupervised and supervised learning
To create AlphaCode, DeepMind's scientists combined unsupervised pretraining with supervised fine-tuning. Often referred to as self-supervised learning, this approach has become popular in applications where labeled data is scarce or where data annotation is expensive and time-consuming.
In the pretraining phase, AlphaCode went through unsupervised learning on 715 GB of data extracted from GitHub. The model is trained by trying to predict the missing parts of a snippet of language or code. The advantage of this approach is that it requires no annotation of any kind, and as it is exposed to more and more samples, the ML model gets better at creating numerical representations of the structure of text and source code.
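The idea of learning from unlabeled code by predicting missing parts can be illustrated with a toy sketch. This is not DeepMind's pipeline: the `<mask>` placeholder, the whitespace tokenization, and the helper names `mask_tokens` and `restore` are all assumptions made for illustration. The point is only that the training targets come from the data itself, so no human annotation is needed.

```python
import random
from typing import List, Tuple

MASK = "<mask>"  # hypothetical placeholder token, chosen for this sketch


def mask_tokens(
    tokens: List[str], rate: float, rng: random.Random
) -> Tuple[List[str], List[Tuple[int, str]]]:
    """Hide a random subset of tokens and return (masked input, targets).

    Each target is (position, original token): the label a model would be
    asked to predict. The labels are derived from the raw text itself.
    """
    masked = list(tokens)
    targets = []
    for i, tok in enumerate(tokens):
        if rng.random() < rate:
            masked[i] = MASK
            targets.append((i, tok))
    return masked, targets


def restore(masked: List[str], targets: List[Tuple[int, str]]) -> List[str]:
    """Fill the hidden positions back in, as a perfect model would."""
    restored = list(masked)
    for i, tok in targets:
        restored[i] = tok
    return restored


# Naive whitespace tokenization of a code snippet (an assumption of this sketch).
tokens = "def add ( a , b ) : return a + b".split()
masked, targets = mask_tokens(tokens, rate=0.3, rng=random.Random(0))
print(masked)
print(targets)
```

A real system would use a learned subword tokenizer and a neural network in place of `restore`, but the input and target pairs are built in the same spirit.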

The algorithm used to train and apply AlphaCode. Image from DeepMind.
The pretrained model is then fine-tuned on CodeContests, an annotated dataset created by the DeepMind team. The dataset contains problem statements, test cases, and correct and incorrect submissions gathered from various sources, including Codeforces, Description2Code, and IBM's CodeNet. The model is trained to turn the text description of a challenge into source code. Its results are evaluated with the test cases and compared with the correct submissions.
When creating the dataset, the researchers took particular care to avoid historical overlap between the training, validation, and test sets. This ensures that the ML model does not produce memorized results when it faces a coding challenge.
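One common way to avoid this kind of overlap is to split the data by publication date, so that everything in the validation and test sets appeared after everything in the training set. The sketch below assumes that approach. The `Problem` type, the function name `temporal_split`, the problem names, and the cutoff dates are all hypothetical and are not taken from the CodeContests dataset.

```python
from datetime import date
from typing import List, NamedTuple, Tuple


class Problem(NamedTuple):
    name: str
    published: date


def temporal_split(
    problems: List[Problem], valid_from: date, test_from: date
) -> Tuple[List[Problem], List[Problem], List[Problem]]:
    """Split problems by date: train < valid_from <= validation < test_from <= test."""
    train = [p for p in problems if p.published < valid_from]
    valid = [p for p in problems if valid_from <= p.published < test_from]
    test = [p for p in problems if p.published >= test_from]
    return train, valid, test


# Hypothetical problems and cutoff dates, for illustration only.
problems = [
    Problem("old-problem", date(2019, 5, 1)),
    Problem("recent-problem", date(2021, 8, 15)),
    Problem("newest-problem", date(2021, 12, 1)),
]
train, valid, test = temporal_split(
    problems, valid_from=date(2021, 7, 1), test_from=date(2021, 10, 1)
)
```

Because the three sets are disjoint in time, a solution that scores well on the test set cannot simply have been memorized from training data published earlier.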

Code generation and filtering
Once AlphaCode has been trained, it is tested on problems it has never encountered before. When AlphaCode processes a new problem, it generates many solutions. It then uses a filtering algorithm to select the 10 best candidates and submits them to the competition. If at least one of them is correct, the problem is considered solved.
According to DeepMind's paper, AlphaCode can generate millions of samples per problem, although it usually generates thousands of solutions. The samples are then filtered to keep only those that pass the tests included in the problem statement. According to the paper, this removes about 99% of the generated samples, but thousands of valid samples can still remain.
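The filtering step can be sketched in a few lines. In this toy version each candidate is a plain Python function from input text to output text, standing in for a generated program. The toy problem (print the sum of two integers), the candidate functions, and the helper names are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Candidate = Callable[[str], str]   # stands in for a generated program
ExampleTest = Tuple[str, str]      # (input text, expected output text)


def passes_examples(candidate: Candidate, tests: List[ExampleTest]) -> bool:
    """Return True only if the candidate matches every example test."""
    for given, expected in tests:
        try:
            if candidate(given).strip() != expected.strip():
                return False
        except Exception:
            # A crashing program is treated as a failed test.
            return False
    return True


def filter_candidates(
    candidates: List[Candidate], tests: List[ExampleTest]
) -> List[Candidate]:
    """Keep the candidates that pass all tests from the problem statement."""
    return [c for c in candidates if passes_examples(c, tests)]


# Hypothetical toy problem: read two integers, print their sum.
def correct(text: str) -> str:
    a, b = map(int, text.split())
    return str(a + b)


def wrong_product(text: str) -> str:
    a, b = map(int, text.split())
    return str(a * b)  # right on "2 2", wrong on "1 2"


def crashes(text: str) -> str:
    return str(int(text))  # int("1 2") raises ValueError


example_tests = [("1 2", "3"), ("2 2", "4")]
survivors = filter_candidates([correct, wrong_product, crashes], example_tests)
```

Passing the example tests does not guarantee passing the hidden tests, which is why a further selection step is still needed after filtering.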
To optimize the sample selection process, a clustering algorithm groups the solutions. According to the researchers, the clustering process tends to group working solutions together. This makes it easier to find a small set of candidates that are likely to pass the competition's hidden tests.
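The article does not say how the clusters are formed. A minimal sketch, assuming that programs are grouped by the outputs they produce on a set of extra probe inputs, and that one representative is taken from each group starting with the largest, might look like this. The probe inputs, the candidate functions, and the names `behaviour` and `pick_submissions` are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Candidate = Callable[[str], str]   # stands in for a generated program


def behaviour(candidate: Candidate, probe_inputs: List[str]) -> Tuple[str, ...]:
    """Describe a program by the outputs it gives on the probe inputs."""
    outputs = []
    for text in probe_inputs:
        try:
            outputs.append(candidate(text))
        except Exception:
            outputs.append("<error>")
    return tuple(outputs)


def pick_submissions(
    candidates: List[Candidate], probe_inputs: List[str], limit: int = 10
) -> List[Candidate]:
    """Group candidates with identical behaviour, then take one per group,
    largest group first, up to `limit` submissions."""
    clusters: Dict[Tuple[str, ...], List[Candidate]] = defaultdict(list)
    for candidate in candidates:
        clusters[behaviour(candidate, probe_inputs)].append(candidate)
    ordered = sorted(clusters.values(), key=len, reverse=True)
    return [group[0] for group in ordered[:limit]]


# Hypothetical candidates for "print the sum of two integers".
def sum_a(text: str) -> str:
    a, b = map(int, text.split())
    return str(a + b)


def sum_b(text: str) -> str:
    return str(sum(int(part) for part in text.split()))


def abs_sum(text: str) -> str:
    a, b = map(int, text.split())
    return str(abs(a) + abs(b))  # agrees with the others only on non-negative input


probe_inputs = ["1 2", "-3 1"]
picks = pick_submissions([sum_a, sum_b, abs_sum], probe_inputs)
```

Here `sum_a` and `sum_b` land in the same group because they behave identically, so only one of them uses up a submission slot, and the larger group is tried first.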
According to DeepMind, when tested in actual programming competitions on the popular Codeforces platform, AlphaCode ranked in the top 54% on average, which is very impressive given the difficulty of coding challenges.


AI vs. humans
DeepMind's blog correctly points out that AlphaCode is the first AI code generation system to "achieve competitive performance levels in programming competitions".
However, some people have mistaken this statement for a claim that AI coding is "as good as human programmers". Comparing narrow artificial intelligence with the general problem-solving ability of humans is a fallacy.
Take DeepBlue and AlphaGo, the AI systems that beat the world champions of chess and Go. Both are remarkable achievements in computer science and artificial intelligence, but each is good at only one task. They could not compete with their human opponents in any other task that requires careful planning and strategy, skills those humans acquired before becoming masters of chess and Go.
The same can be said of competitive programming. A programmer who has reached a competitive level in coding challenges has spent years learning. Such programmers can think abstractly, solve simpler challenges, write simple programs, and show many other skills that are taken for granted and not evaluated in programming competitions.
In short, these competitions are designed for humans. You can be fairly sure that a person who ranks high in competitive programming is, generally speaking, a good programmer. That is why many companies use these challenges to make hiring decisions.
AlphaCode, on the other hand, is a shortcut to competitive programming, albeit a brilliant one. It creates novel code and does not copy and paste from its training data. But it is not the equivalent of an average programmer.
So instead of pitting AlphaCode against programmers, we should be more interested in what AlphaCode and other AI systems like it can do when working alongside human programmers. These tools can have a huge impact on programmer productivity. They may even change the culture of programming, shifting humans toward formulating problems (still a discipline in the domain of human intelligence) and letting AI systems generate the code.
But programmers will remain in control, and they must learn the powers and limits of AI-generated code.
Reference link:
https://thenextweb.com/news/deepmind-alphacode-tool-not-replacement-for-human-programmers-syndication

