AlphaCode is not a substitute for programmers, but a tool for developers
2022-07-02 10:15:00 【AI technology base camp】
Compiled by | Hemu wood
Produced by | AI Technology Base Camp (ID: rgznai100)
DeepMind, the AI research lab, has introduced a deep learning model that can generate software source code with remarkable results. The model, called AlphaCode, is based on Transformers, the same architecture OpenAI uses in its code generation models.
Programming is one of the promising applications of deep learning and large language models. The growing demand for programming talent has spurred a race to create tools that make developers more productive and give non-developers a way to create software.
In this respect, AlphaCode is genuinely impressive. It has solved complex programming challenges that typically require hours of planning, coding, and testing. It may well become a good tool for turning problem descriptions into working code.
But it is not the equivalent of a human programmer of any level. It is a completely different approach to creating software, one that remains incomplete without human thinking and intuition.
Coding competition
Example of a coding challenge description. Image from DeepMind.
AlphaCode is not the only system of its kind, but it accomplishes a far more complex task. Other similar systems focus on generating short snippets of code, such as functions or code blocks that perform small tasks (for example, setting up a web server or extracting information from an API). Impressive as they are, these tasks become trivial once a language model has been exposed to a large enough corpus of source code.
AlphaCode, on the other hand, is designed to solve competitive programming problems. Participants in a coding challenge must read the challenge description, understand the problem, turn it into an algorithmic solution, implement it in a general-purpose language, and evaluate it against a limited set of test cases. Finally, their results are scored on hidden tests that are not available during implementation. Coding challenges can also impose other conditions, such as time and memory limits.
Basically, a machine learning model taking part in a coding challenge must generate a complete program that solves a problem it has never seen before.
Example of a coding challenge solution. Image from DeepMind.
Transformers and the power of large language models
AlphaCode is another example of the progress large language models have made in solving complex problems. This kind of deep learning system is often called a sequence-to-sequence model (Seq2seq). A Seq2seq algorithm takes a sequence of values (letters, pixels, numbers, etc.) as input and generates another sequence of values. This is the approach used in many natural language tasks, such as machine translation, text generation, and speech recognition.
According to DeepMind's paper, AlphaCode uses an encoder-decoder Transformer architecture. Transformers have become especially popular in recent years because they can handle long sequences of data with far lower memory and compute requirements than their predecessors, recurrent neural networks (RNN) and long short-term memory networks (LSTM).
The structure of the Transformer network
The encoder part of AlphaCode creates a numerical representation of the natural language description of the problem. The decoder part takes the embedding vectors produced by the encoder and tries to generate the source code of the solution.
Transformer models have proven to be good at such tasks, especially when given enough training data and computing power. But in the view of researchers, what really makes AlphaCode stand out is not just the brute force of feeding raw data into a very large neural network. It is more the ingenuity of DeepMind's scientists in designing the training process and the algorithms that generate and filter the model's results.
Unsupervised and supervised learning
To create AlphaCode, DeepMind's scientists combined unsupervised pre-training with supervised fine-tuning. Often referred to as self-supervised learning, this approach has become popular in applications where labeled data is scarce or data annotation is expensive and time-consuming.
In the pre-training phase, AlphaCode was trained without supervision on 715GB of data extracted from GitHub. The model is trained by trying to predict the missing parts of a language or code snippet. The advantage of this approach is that it requires no annotation of any kind, and as it is exposed to more and more samples, the ML model becomes better at creating numerical representations of the structure of text and source code.
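The idea of predicting missing parts can be illustrated with a toy example. The following is only a sketch under stated assumptions, not DeepMind's training code: the `<mask>` token, the whitespace tokenizer, and the `make_masked_example` helper are all hypothetical. It shows how an unlabeled snippet yields an (input, target) training pair without any human annotation.

```python
import random

MASK = "<mask>"  # hypothetical placeholder token, not AlphaCode's real vocabulary


def make_masked_example(tokens, span_len, rng):
    """Hide one contiguous span of tokens.

    Returns the corrupted input, the hidden target span, and the span's start index.
    """
    if not 0 < span_len <= len(tokens):
        raise ValueError("span_len must be between 1 and len(tokens)")
    start = rng.randrange(len(tokens) - span_len + 1)
    target = tokens[start:start + span_len]
    corrupted = tokens[:start] + [MASK] + tokens[start + span_len:]
    return corrupted, target, start


# Toy "code snippet", split on whitespace for simplicity.
tokens = "def add ( a , b ) : return a + b".split()
corrupted, target, start = make_masked_example(tokens, span_len=3, rng=random.Random(0))

print("input :", " ".join(corrupted))
print("target:", " ".join(target))
```

The label (the hidden span) comes from the data itself, which is why this style of training needs no annotation.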
The algorithm for training and applying AlphaCode. Image from DeepMind.
The pre-trained model is then fine-tuned on CodeContests, an annotated dataset created by the DeepMind team. The dataset contains problem statements, test cases, and correct and incorrect submissions collected from various sources, including Codeforces, Description2Code, and IBM's CodeNet. The model is trained to convert the text description of a challenge into source code. Its results are evaluated against the test cases and compared with the correct submissions.
When creating the dataset, the researchers paid particular attention to avoiding historical overlap between the training, validation, and test sets. This ensures that the ML model does not produce memorized results when facing a coding challenge.
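One common way to avoid this kind of overlap is to split the data by date, so that everything in the validation and test sets was published after everything in the training set. The sketch below assumes this strategy for illustration; the `temporal_split` helper, the record layout, and the cutoff dates are hypothetical and are not taken from the CodeContests dataset.

```python
from datetime import date


def temporal_split(problems, train_end, valid_end):
    """Split problems by publication date so that each later set is strictly newer."""
    train = [p for p in problems if p["published"] < train_end]
    valid = [p for p in problems if train_end <= p["published"] < valid_end]
    test = [p for p in problems if p["published"] >= valid_end]
    return train, valid, test


# Hypothetical problem records; the ids and dates are made up for illustration.
problems = [
    {"id": "A", "published": date(2019, 3, 1)},
    {"id": "B", "published": date(2020, 6, 15)},
    {"id": "C", "published": date(2021, 2, 10)},
    {"id": "D", "published": date(2021, 11, 5)},
]

train, valid, test = temporal_split(
    problems, train_end=date(2021, 1, 1), valid_end=date(2021, 9, 1)
)
print([p["id"] for p in train], [p["id"] for p in valid], [p["id"] for p in test])
```

With such a split, a solution to a test problem cannot have appeared in the training data simply because it did not exist yet.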
Code generation and filtering
Once AlphaCode is trained, it is tested on problems it has never seen before. When AlphaCode handles a new problem, it generates many solutions. It then uses a filtering algorithm to select the 10 best candidates and submits them to the competition. If at least one of them is correct, the problem is considered solved.
According to DeepMind's paper, AlphaCode can generate millions of samples per problem, although it usually generates thousands of solutions. The samples are then filtered to keep only those that pass the tests included in the problem statement. According to the paper, this removes about 99% of the generated samples, yet thousands of valid samples still remain.
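The filtering step can be sketched in a few lines. This is a simplified illustration, not the system described in the paper: candidates here are plain Python functions rather than generated programs run in a sandbox, and the helper names and the toy problem (sum a list of integers) are assumptions.

```python
def passes_examples(candidate, examples):
    """Return True if the candidate gives the expected output for every example."""
    for given, expected in examples:
        try:
            if candidate(given) != expected:
                return False
        except Exception:
            return False  # a crash counts as a failed test
    return True


def filter_candidates(candidates, examples):
    """Keep only the candidates that pass all example tests."""
    return [c for c in candidates if passes_examples(c, examples)]


def loop_sum(xs):
    total = 0
    for x in xs:
        total += x
    return total


def count_items(xs):
    return len(xs)  # wrong: returns the length, not the sum


def double_max(xs):
    return max(xs) * 2  # right on [1, 2, 3] by luck, crashes on an empty list


# Example tests as they might appear in a problem statement (made up here).
examples = [([1, 2, 3], 6), ([], 0)]
candidates = [sum, loop_sum, count_items, double_max]

survivors = filter_candidates(candidates, examples)
print([c.__name__ for c in survivors])
```

Only the candidates that agree with every example test survive, which is how a large pool of samples can shrink to a small fraction.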
To optimize the sample selection process, a clustering algorithm is used to group the solutions. According to the researchers, the clustering process tends to group working solutions together. This makes it easier to find a small set of candidates that are likely to pass the competition's hidden tests.
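One plausible way to group solutions is by behavior: programs that produce the same outputs on a set of extra inputs land in the same cluster, and one representative is taken from each of the largest clusters. The sketch below assumes that approach for illustration; the probe inputs, the helper names, and the toy problem (double a number) are hypothetical, and the article does not specify the exact clustering method.

```python
from collections import defaultdict


def behavior_signature(candidate, probe_inputs):
    """Summarize a candidate by its outputs (or error types) on the probe inputs."""
    outputs = []
    for given in probe_inputs:
        try:
            outputs.append(repr(candidate(given)))
        except Exception as exc:
            outputs.append("error:" + type(exc).__name__)
    return tuple(outputs)


def pick_submissions(candidates, probe_inputs, limit=10):
    """Group candidates by behavior and take one from each of the largest groups."""
    clusters = defaultdict(list)
    for candidate in candidates:
        clusters[behavior_signature(candidate, probe_inputs)].append(candidate)
    ranked = sorted(clusters.values(), key=len, reverse=True)
    return [cluster[0] for cluster in ranked[:limit]]


def double_mul(x):
    return x * 2


def double_add(x):
    return x + x


def double_shift(x):
    return x << 1


def square(x):
    return x * x  # agrees with doubling on 0 and 2, but not on 3


candidates = [double_mul, double_add, double_shift, square]
picks = pick_submissions(candidates, probe_inputs=[0, 2, 3], limit=2)
print([c.__name__ for c in picks])
```

In this toy case the three equivalent doubling functions form the largest cluster, so one of them is chosen first. A candidate such as `square` would pass an example test with input 2, which shows why extra inputs help separate solutions that only look correct.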
According to DeepMind, when tested in actual programming competitions on the popular Codeforces platform, AlphaCode ranked within the top 54% on average, which is very impressive given how difficult coding challenges are.
AI vs. humans
DeepMind's blog correctly points out that AlphaCode is the first AI code generation system to "achieve competitive performance levels in programming competitions."
However, some people have mistaken this statement to mean that AI coding is "as good as human programmers." It is a fallacy to compare narrow artificial intelligence with the general problem-solving abilities of humans.
Take DeepBlue and AlphaGo, the AI systems that beat the world champions of chess and Go. Although both systems are remarkable achievements in computer science and artificial intelligence, each is good at only one task. They cannot compete with their human opponents at any other task that requires careful planning and strategy, skills that humans acquire before they become masters of chess and Go.
The same can be said of competitive programming. A programmer who has reached a competitive level in coding challenges has spent years learning. They can think abstractly, solve simpler challenges, write simple programs, and demonstrate many other skills that are taken for granted and are not evaluated in programming competitions.
In short, these competitions are designed for humans. You can be fairly sure that a person who ranks at the top of competitive programming is, in general, a good programmer. That is why many companies use these challenges to make hiring decisions.
AlphaCode, on the other hand, is a shortcut to competitive programming, albeit an excellent one. It creates novel code and does not copy and paste from its training data. But it is not the same as an ordinary programmer.
So rather than pitting AlphaCode against programmers, we should be more interested in what AlphaCode and other AI systems like it can do when they work alongside human programmers. These tools can have a huge impact on programmer productivity. They may even change programming culture, shifting humans toward formulating problems (a discipline that still belongs to the domain of human intelligence) and letting AI systems generate the code.
But programmers will remain in control, and they must learn to make use of the strengths and limits of AI-generated code.
Reference link:
https://thenextweb.com/news/deepmind-alphacode-tool-not-replacement-for-human-programmers-syndication