What key progress has been made in deep learning in 2021?
2022-07-06 20:35:00 【woshicver】
Link: https://www.zhihu.com/question/504050716
Editor: Deep Learning and Computer Vision
Statement: Shared for academic purposes only; will be removed on request in case of infringement.
Author: Astral evil
https://www.zhihu.com/question/504050716/answer/2280529580
I wouldn't presume to pass judgment; this is only my personal opinion. Someone on GitHub has compiled a list of amazing AI papers from 2021 that I find fairly balanced, and it can basically be treated as a collection of this year's most influential papers: https://github.com/louisfb01/best_AI_papers_2021
In my view, the first development is that Transformers have taken over every field, with Swin Transformer in particular sweeping the board. The second is the release of large pretrained models by major research institutions and their striking performance on downstream tasks, which of course is inseparable from self-supervised learning plus Transformers. The third is MAE, which everyone has mentioned and which is again inseparable from Transformers. Another important one, I think, is that the line of work based on NeRF also took off this year, including the CVPR best paper GIRAFFE, although this work is concentrated mainly in research teams outside China.
Author: Riser
https://www.zhihu.com/question/504050716/answer/2285962009
I just read Andrew Ng's Christmas message, themed "give a rose and the fragrance lingers on your hand", in which he reviews how the AI community developed in 2021 and looks ahead to its future.
Link to the original text: https://read.deeplearning.ai/the-batch/issue-123/
Andrew Ng mainly covered the takeoff of multimodal AI, large models with trillions of parameters, the Transformer architecture, AI-generated audio content, and the introduction of AI-related laws. The first three are also topics I follow, so here is a little of my own understanding alongside his remarks.
Personally, I feel OpenAI's CLIP is the outstanding representative of multimodal AI in 2021. It models image classification as image-text matching and uses the large amount of text available on the Internet to supervise image tasks. I feel that "text + images", and even "text + images + knowledge graphs", is a direction with good prospects, and many labs have already started research in this area. In addition, OpenAI's DALL·E (which generates an image from input text), DeepMind's Perceiver IO (which classifies text, images, videos, and point clouds), and Stanford University's ConVIRT (which adds text labels to medical X-ray images) are also good starts on this topic.
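To make "classification as image-text matching" concrete, here is a minimal sketch rather than CLIP's actual code. It assumes the image and each candidate caption have already been turned into embedding vectors by some pair of encoders; the toy three-dimensional vectors, the function name `zero_shot_classify`, and the `temperature` value are all illustrative assumptions. The predicted class is simply the caption whose embedding has the highest cosine similarity with the image embedding.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_classify(image_embedding, text_embeddings, labels, temperature=100.0):
    """Pick the caption whose embedding best matches the image embedding.

    `temperature` is an assumed scaling factor applied to the cosine
    similarities before the softmax; it is not taken from the article.
    """
    img = l2_normalize(image_embedding)
    logits = []
    for text_vec in text_embeddings:
        txt = l2_normalize(text_vec)
        cosine = sum(a * b for a, b in zip(img, txt))
        logits.append(temperature * cosine)
    probs = softmax(logits)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs


# Toy, hand-written embeddings standing in for real encoder outputs (hypothetical).
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]
text_embeddings = [[1.0, 0.1, 0.0], [0.1, 1.0, 0.0], [0.0, 0.1, 1.0]]
image_embedding = [0.9, 0.2, 0.1]

predicted, probs = zero_shot_classify(image_embedding, text_embeddings, labels)
print(predicted)
```

Because the class names only enter through the captions, new categories can be added by writing new text prompts, with no retraining of a classification head.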
Clearly, over the past year models have gone from large to even larger.
From Google's Switch Transformer with 1.6 trillion parameters to Wu Dao 2.0 from the Beijing Academy of Artificial Intelligence with 1.75 trillion, the upper bound on model scale has been pushed up again and again. Whatever their size, their original motivation is the same as BERT's: to provide a more general and better pretrained language model for many downstream tasks. Perhaps this idea of "general learning" will also carry over to the CV field (in fact, many of our tasks already transfer ImageNet-pretrained models). A higher-level general CV model may require us to think about the characteristics of the image data format and about self-supervised training schemes.
The other development is that Transformers have been on a rampage at the top vision and machine learning conferences. Swin Transformer, standing on the shoulders of predecessors such as ViT, DETR, and many other vision Transformers, took the ICCV 2021 best paper award and demonstrated the applicability of Transformers to vision tasks. In audio, text, and other sequence tasks, Transformers have essentially displaced RNNs, and this year we saw Transformers begin to challenge the supremacy of CNNs in vision tasks. Of course, combining the two organically is also a hot and promising direction right now. DeepMind released an open-source version of AlphaFold 2, which uses transformers to predict a protein's 3D structure from its amino acid sequence. It stunned the medical community and made an outstanding contribution to biology. All of this shows that the Transformer generalizes well, and we also look forward to more, and better, model architectures that solve more problems.
Another thing that cannot be ignored is the explosion of work based on NeRF (Neural Radiance Fields), which almost dominates many topics such as 3D reconstruction. Strictly speaking, NeRF is a 2020 work, and I have always felt it was a pity that it did not win the ECCV best paper that year (of course, RAFT is also very strong). However, GIRAFFE taking the CVPR 2021 best paper makes up for that regret.
All in all, much of the AI research in 2021 was exciting. Let's look forward to, and experience, the development of AI in 2022!
Author: Anonymous user
https://www.zhihu.com/question/504050716/answer/2280944226
On the theory side, it feels like most of the work is filler. The only work that might count as key progress is probably "Proxy Convexity: A Unified Framework for the Analysis of Neural Networks Trained by Gradient Descent" (Frei & Gu, 2021). This paper brings together much of deep learning optimization theory.
Author: Grey pupil sextant
https://www.zhihu.com/question/504050716/answer/2280495756
The areas I follow are fairly narrow, and nothing in them struck me as especially amazing.
Among the popular work, MAE is really interesting, but I still feel that masking there is not as natural as it is in NLP.
I keep feeling that the future of self-supervised pretraining in CV is not as close as people think. I have a vague sense that it will be tied to 3D reconstruction from 2D images.
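For readers unfamiliar with the masking being compared here, the following is a rough sketch of MAE-style random patch masking, not the authors' implementation. It assumes an image already split into a sequence of patches; the function name `random_patch_mask`, the 196-patch count (a 14 x 14 grid), and the 75% mask ratio are assumptions chosen for illustration. Only the visible patches would be fed to the encoder, and the model is trained to reconstruct the masked ones.

```python
import random


def random_patch_mask(num_patches, mask_ratio=0.75, seed=0):
    """Randomly split patch indices into a visible set and a masked set.

    `mask_ratio` and `seed` are illustrative defaults, not values taken
    from the article.
    """
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)  # random permutation of patch positions
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(indices[:num_masked])   # patches hidden from the encoder
    visible = sorted(indices[num_masked:])  # patches the encoder actually sees
    return visible, masked


# Hypothetical setting: a 14 x 14 grid of patches.
visible, masked = random_patch_mask(196)
print(len(visible), len(masked))
```

In NLP the masked unit is a token with a discrete identity, whereas here it is an arbitrary square of pixels, which may be part of why the answer above finds image masking less natural.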
Author: Anonymous user
https://www.zhihu.com/question/504050716/answer/2279821079
The key work, to my mind, is CLIP. I find CLIP more interesting than ViT. Of course, ViT also opened up a very important direction: rethinking architecture for vision tasks.
DALL·E is a very impressive piece of work. There was also a lot of GAN work, such as StyleGAN3 and GauGAN2, and there were many follow-ups to NeRF.
Besides that, there is SSL, but I don't think it was an essential breakthrough. Even MAE only shows that the earlier idea of self-reconstruction is very effective for a ViT backbone.
Author: Cat eat fish
https://www.zhihu.com/question/504050716/answer/2279784861
On seeing this question, the first thing that comes to mind is probably this year's ICCV best paper, Swin Transformer. The paper is a successor to ViT (Vision Transformer), the Transformer model that is currently so popular in the CV field.
Looking at how Transformers were applied this year at the top computer vision conferences, CVPR and ICCV, papers using Transformers make up a large share, which suggests that using Transformers in CV will become a wave. Swin Transformer is the peak of that work: at present, there is probably no architecture in the CV field that performs better than Swin Transformer.
So I think Swin Transformer can be called the key progress in deep learning this year.