AI scores 81 in advanced mathematics. Netizens: even AI models can't escape "involution"!

2022-07-03 13:21:00 【CSDN information】

Compiled by | Hemumu

Produced by | AI Technology Base (ID: rgznai100)

Is advanced mathematics a nightmare for many science students? Your editor was certainly weak at it back in the day.

So how hard is it for an AI to solve a math problem, let alone an advanced mathematics problem?

Not long ago, this topic showed up in the trending searches:

Doesn't that make it even harder to accept?!

In recent years, scientists have kept trying to get AI to take math exams, but for years the results were poor, with scores as low as 20-odd points. As a result, scientists generally believed that AI could not take on advanced mathematics. Recently, however, scientists at MIT, building on the pre-trained OpenAI Codex model, reached 81% accuracy on advanced mathematics through few-shot learning. The research has been posted on arXiv. The courses range from elementary calculus to differential equations, probability theory, and linear algebra, and beyond calculation, some questions even require plotting.

The Minerva language model

The researchers found that there are many ways to have an AI solve mathematical problems.

First, the latest GPT-3 language model on its own reaches only 18.8% accuracy. Second, with few-shot learning and the latest chain-of-thought prompting, accuracy rises to 30.8%. Finally, using Codex, a model fine-tuned on code, with few-shot learning, the researchers had the AI attempt 210 questions from six MIT math courses, and accuracy rose to 81.1%.
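To make the few-shot idea concrete, here is a minimal sketch of how a few-shot prompt might be assembled from a handful of solved examples. The function name `build_few_shot_prompt`, the "Question/Solution" layout, and the example problems are all illustrative assumptions, not the prompt format used in the research.

```python
from typing import List, Tuple


def build_few_shot_prompt(examples: List[Tuple[str, str]], question: str) -> str:
    """Join solved (question, solution) pairs and append the new question.

    The layout is a hypothetical one chosen for illustration only.
    """
    parts = []
    for solved_question, solution in examples:
        parts.append(f"Question: {solved_question}\nSolution: {solution}")
    # The final block is left open so the model continues with a solution.
    parts.append(f"Question: {question}\nSolution:")
    return "\n\n".join(parts)


# Made-up worked examples, written step by step in a chain-of-thought style.
examples = [
    ("What is the derivative of x**2?", "Apply the power rule: d/dx x**2 = 2*x."),
    ("What is 3 + 4 * 2?", "Multiply first: 4 * 2 = 8, then 3 + 8 = 11."),
]
prompt = build_few_shot_prompt(examples, "What is the derivative of x**3?")
print(prompt)
```

The resulting string would be sent to a language model; which model and which API is outside the scope of this sketch.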

The research team's approach is to take a model pre-trained on text and fine-tuned on code, and to transform each mathematical problem into an equivalent programming problem. The AI automatically generates supplementary context, producing text suited to the model; it then generates the corresponding code and runs it to solve the problem. As a next step, the team plans to extend the technique to more courses and to consider practical applications in teaching.
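The flow described above (add context, generate code, run it) can be sketched roughly as follows. This is only an illustration under stated assumptions: `add_context`, `generate_program`, and `run_program` are hypothetical helpers, and `generate_program` is a stand-in that returns a canned program instead of calling a real code model.

```python
def add_context(question: str) -> str:
    """Rewrite the question as a programming task (hypothetical wording)."""
    return f"Write a Python program that stores the result in `answer`. {question}"


def generate_program(task: str) -> str:
    """Stand-in for a code model: returns a canned program for illustration."""
    return "import math\nanswer = math.factorial(5)"


def run_program(source: str):
    """Execute the generated source in a fresh namespace and read `answer`."""
    namespace = {}
    exec(source, namespace)  # only safe here because the source is canned
    return namespace["answer"]


task = add_context("Compute 5 factorial.")
result = run_program(generate_program(task))
print(result)  # 120
```

Running model-generated code with `exec` is unsafe in general; a real system would presumably run it in a sandbox with time and resource limits.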

Separately, according to the Google AI Blog post linked below, Google researchers have introduced a language model called Minerva, which solves mathematical and scientific problems by reasoning step by step. By collecting training data relevant to quantitative reasoning problems, training models at large scale, and using advanced inference techniques, this work achieves significant performance gains on a variety of difficult quantitative reasoning tasks.

Minerva solves problems by generating the solution itself, including numerical calculation and symbolic manipulation, rather than relying on external tools such as a calculator. It can combine natural language and mathematical notation to parse and answer mathematical problems.

In addition, Minerva combines several techniques, including few-shot prompting, chain-of-thought or scratchpad prompting, and majority voting, to achieve SOTA performance on STEM reasoning tasks.
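Majority voting, one of the techniques listed above, can be illustrated with a short sketch: sample several solutions to the same problem, extract each final answer, and keep the most frequent one. The function name `majority_vote` and the sample answers below are made up for illustration and are not taken from Minerva's implementation.

```python
from collections import Counter
from typing import List


def majority_vote(final_answers: List[str]) -> str:
    """Return the most common final answer among the sampled solutions."""
    # Strip whitespace so that "42" and "42 " count as the same answer.
    counts = Counter(answer.strip() for answer in final_answers)
    return counts.most_common(1)[0][0]


# Final answers extracted from five hypothetical sampled solutions.
samples = ["42", "41", "42 ", "42", "40"]
print(majority_vote(samples))  # 42
```

In this sketch ties are broken by whichever answer appeared first; a real system might need a more careful rule, and would also need to normalize equivalent answers such as "1/2" and "0.5".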

Minerva can solve not only algebra problems but also problems in physics, number theory, geometry, biology, chemistry, astronomy, and many other areas.

Here Minerva solves a geometry problem:

For word problems, it can set up equations:

It can even carry out derivations and proofs.

To test Minerva's quantitative reasoning ability, the researchers evaluated it on different STEM benchmarks, ranging from elementary-school-level problems to graduate-level coursework. They also evaluated Minerva on OCWCourses, a set of problems collected from MIT OpenCourseWare covering STEM topics such as solid-state chemistry, astronomy, differential equations, and special relativity.

The results show that, across all the datasets evaluated, the 540-billion-parameter Minerva achieves SOTA, sometimes by a wide margin.

However, Minerva still makes plenty of mistakes.

To better identify where the model can improve, the researchers analyzed a sample of problems the model got wrong and found that most of the errors are easy to explain. About half are calculation errors; the other half are reasoning errors, in which the solution steps do not follow a logical chain of thought.

Minerva can also arrive at the correct final answer through faulty reasoning. The analysis shows that this is relatively rare: for Minerva 62B, the average rate on the MATH dataset is below 8%.

Conclusion

AI is developing well not only within tech circles; it is also showing its strength in other fields. Earlier, an AI wrote 40 college entrance exam essays in 40 seconds, and AI has been used to restore many precious photos and pictures.

Students look forward to one day having AI do their homework, and teachers likewise hope to use AI to write papers.

Some netizens also said they want to challenge it.

What do you think?

Reference links:

https://s.weibo.com/weibo/%2523AI%25E8%2580%2583%25E9%25AB%2598%25E6%2595%25B0%25E4%25BB%2585%25E5%25BE%259781%25E5%2588%2586%2523?topnav=1&wvr=6&Refer=top_hot&sudaref=weibo.com

https://ai.googleblog.com/2022/06/minerva-solving-quantitative-reasoning.html

Copyright notice
This article was created by [CSDN information]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/184/202207031239451010.html
