How hard is advanced math? AI breaks into the mathematics world with 81% accuracy on advanced mathematics exams!
2022-07-02 23:19:00 【AI technology base camp】

Compiled by | Hemu wood
Produced by | AI Technology Base Camp (ID: rgznai100)
OpenAI's Codex has reached 81.1% accuracy on problems from 7 advanced mathematics courses at MIT, solidly at the level of an MIT undergraduate. The courses range from elementary calculus to differential equations, probability theory, and linear algebra, and besides calculation the questions even include plotting.
Advanced mathematics is a nightmare for many science students. This editor was one of those who struggled with it back then.
So how hard is it for AI to solve a math problem, let alone advanced mathematics?
Yesterday, this trending search appeared:

Isn't that even harder to accept?!
For years, scientists have tried to get AI to take math exams, but it kept failing, sometimes scoring only in the 20s. As a result, scientists generally believed that AI could not handle advanced mathematics. Recently, however, scientists at MIT used few-shot learning with OpenAI's pre-trained Codex model to reach 81% accuracy on advanced mathematics problems. The research has been posted on arXiv.

Language model Minerva
The researchers found that there are several ways to have AI solve mathematical problems.
First, the latest GPT-3 language model on its own achieves only 18.8% accuracy. Second, with few-shot learning and the latest chain-of-thought prompting, accuracy rises to 30.8%. Finally, using Codex, a model fine-tuned on code, with few-shot learning, the researchers had the AI take on 210 questions from six MIT math courses, and accuracy improved to 81.1%.
The research team's solution is a model that is first pre-trained on text and then fine-tuned on code. Each mathematical problem is transformed into an equivalent programming task: the AI automatically generates supplementary context so that the text suits the model, then the corresponding code is generated and run to solve the problem. As a next step, the team plans to extend the technique to more courses and to consider practical applications in teaching.
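The generate-then-run idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the research team's code: the few-shot example, the helper names `build_prompt` and `run_generated_program`, and the hand-written `generated_program` string (a stand-in for what a code model might return) are all hypothetical.

```python
from fractions import Fraction

# Hypothetical few-shot examples: (question, program) pairs shown to the model.
FEW_SHOT_EXAMPLES = [
    (
        "What is the sum of the integers from 1 to 100?",
        "answer = sum(range(1, 101))",
    ),
]


def build_prompt(question, examples=FEW_SHOT_EXAMPLES):
    """Assemble a few-shot prompt: solved examples first, then the new question."""
    parts = []
    for example_question, example_program in examples:
        parts.append(f"# Question: {example_question}\n{example_program}\n")
    parts.append(f"# Question: {question}\n")
    return "\n".join(parts)


def run_generated_program(program):
    """Execute the program text and return the value it binds to `answer`."""
    namespace = {"Fraction": Fraction}
    # exec is only acceptable for trusted text; a real system would sandbox this step.
    exec(program, namespace)
    return namespace["answer"]


question = "What is the probability of rolling at least one six in four rolls of a fair die?"
prompt = build_prompt(question)

# Hand-written stand-in for the code a model might return for `prompt` (assumption).
generated_program = "answer = 1 - Fraction(5, 6) ** 4"

result = run_generated_program(generated_program)
print(result)  # 671/1296
```

Because the final answer comes from running the program, the arithmetic is done by the interpreter rather than by the language model.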
Separately, researchers at Google have introduced Minerva, a language model that solves mathematical and scientific problems by reasoning step by step. By collecting training data relevant to quantitative reasoning problems, training models at scale, and using advanced inference techniques, this research achieves significant performance gains on a variety of difficult quantitative reasoning tasks.
Minerva solves problems by generating solutions that include numerical calculation and symbolic manipulation, without relying on external tools such as calculators. It can combine natural language and mathematical notation to analyze and answer mathematical questions.
In addition, Minerva combines several techniques, including few-shot prompting, chain-of-thought prompting, scratchpad prompting, and majority voting, to achieve SOTA performance on STEM reasoning tasks.
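Majority voting is the simplest of these to illustrate: sample several solutions to the same question and keep the final answer that appears most often. The sketch below is a minimal illustration, not Minerva's implementation; the function name `majority_vote` and the sample answers are hypothetical.

```python
from collections import Counter


def majority_vote(final_answers):
    """Return the most common final answer among the sampled solutions."""
    if not final_answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(final_answers)
    # most_common(1) returns a one-element list of (answer, count) pairs.
    answer, _ = counts.most_common(1)[0]
    return answer


# Hypothetical final answers extracted from five sampled solutions.
sampled_answers = ["12", "12", "15", "12", "9"]
chosen = majority_vote(sampled_answers)
print(chosen)  # 12
```

In this sketch, ties go to whichever answer was seen first, which is how `Counter.most_common` orders equal counts.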
Minerva can solve not only algebra problems but also problems in physics, number theory, geometry, biology, chemistry, astronomy, and many other areas.

Here is Minerva solving a geometry problem:

For word problems, it can set up equations:

It can even carry out derivations and proofs.
To test Minerva's quantitative reasoning ability, the researchers evaluated it on several STEM benchmarks, ranging from elementary-school-level problems to graduate-level coursework. They also evaluated Minerva on OCWCourses, a set of problems collected from MIT OpenCourseWare covering STEM topics such as solid state chemistry, astronomy, differential equations, and special relativity.
The results show that, across all the datasets evaluated, the 540-billion-parameter Minerva achieves SOTA results, sometimes by a substantial margin.
However, Minerva still makes quite a few mistakes.
To better identify where the model can be improved, the researchers analyzed a sample of problems the model got wrong and found that most of the errors are easy to explain. About half are calculation errors, and the other half are reasoning errors, where the solution steps do not follow a logical chain of thought.
Minerva can also arrive at the correct final answer through faulty reasoning. The analysis shows that this is relatively rare: for Minerva 62B, the average rate on the MATH dataset is below 8%.

Conclusion
AI is not only developing well in the technology world but also showing its strength in other fields. Previously, AI was asked to write a college entrance examination essay, and AI was used to restore precious footage of the PLA garrison in Hong Kong.
Students are looking forward to using AI to do their homework one day, and teachers also hope to use AI to write papers.
Some netizens also said they want to challenge it.
What do you think ?
Reference links:
https://s.weibo.com/weibo/%2523AI%25E8%2580%2583%25E9%25AB%2598%25E6%2595%25B0%25E4%25BB%2585%25E5%25BE%259781%25E5%2588%2586%2523?topnav=1&wvr=6&Refer=top_hot&sudaref=weibo.com
https://ai.googleblog.com/2022/06/minerva-solving-quantitative-reasoning.html
