An Overview of Self-Supervised Learning for Video
2022-07-05 18:40:00 【Zhiyuan community】
https://arxiv.org/abs/2207.00419
The remarkable success of deep learning in various fields depends on the availability of large-scale annotated datasets. However, relying on human-generated annotations leads to biased learning, poor domain generalization, and poor robustness. Obtaining annotations is also expensive and labor-intensive, which is particularly challenging for video. As an alternative, self-supervised learning provides a way to learn representations without annotations, and it has shown promise in both the image and video domains. Compared with the image domain, learning video representations is more challenging because the temporal dimension introduces motion and other environmental dynamics. This also creates opportunities for video-specific ideas that advance self-supervised learning in the video and multimodal domains. In this survey, we review existing approaches to self-supervised learning in the video domain. We group these methods into three categories according to their learning objectives: 1) pretext tasks, 2) generative modeling, and 3) contrastive learning. The methods also differ in the modalities they use: 1) video, 2) video-audio, 3) video-text, and 4) video-audio-text. We further introduce commonly used datasets, downstream evaluation tasks, limitations of existing work, and potential future directions in this field.
The need for large numbers of labeled samples limits the use of deep networks in problems where data is scarce and annotation is difficult, such as medical imaging Dargan et al. [2020]. Although pre-training on large-scale labeled datasets such as ImageNet Krizhevsky et al. [2012a] and Kinetics Kay et al. [2017] does improve performance, this approach has drawbacks, including annotation cost Yang et al. [2017], Cai et al. [2021], annotation bias Chen and Joo [2021], Rodrigues and Pereira [2018], lack of domain generalization Wang et al. [2021a], Hu et al. [2020], Kim et al. [2021], and lack of robustness Hendrycks and Dietterich [2019], Hendrycks et al. [2021]. Self-supervised learning (SSL) has emerged as a successful way to pre-train deep models and overcome some of these problems. It is a promising alternative in which models can be trained on large datasets without labels Jing and Tian [2020], and it generalizes better. SSL trains the model with a learning objective derived from the training samples themselves. The pre-trained model is then used as the initialization for the target dataset and fine-tuned with the available labeled samples. Figure 1 shows an overview of this approach.
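To make the idea of a learning objective derived from the samples themselves more concrete, here is a minimal sketch of how a pretext task can produce labels without any human annotation. It uses frame-order verification as the example task, which is an assumption chosen for illustration and is not named in the text above. The function names `make_order_example` and `build_pretext_dataset` are hypothetical, and integers stand in for real video frames.

```python
import random


def make_order_example(clip, rng):
    """Build one self-labeled example from an unlabeled clip.

    Returns (frames, label), where label is 1 if the frames are in their
    original temporal order and 0 if they were shuffled.
    Illustrative pretext task only; not taken from the survey text.
    """
    frames = list(clip)
    if len(frames) < 2:
        raise ValueError("a clip needs at least two frames")
    if rng.random() < 0.5:
        return frames, 1  # keep the original order
    # Shuffle positions until the permutation differs from the identity,
    # so a "shuffled" example is never accidentally in order.
    identity = list(range(len(frames)))
    order = identity[:]
    while order == identity:
        rng.shuffle(order)
    return [frames[i] for i in order], 0


def build_pretext_dataset(clips, seed=0):
    """Turn unlabeled clips into (frames, label) pairs for pre-training."""
    rng = random.Random(seed)
    return [make_order_example(clip, rng) for clip in clips]


# Toy "videos": each clip is a list of integers standing in for frames.
clips = [[0, 1, 2, 3], [10, 11, 12, 13], [20, 21, 22, 23]]
dataset = build_pretext_dataset(clips)
```

A model pre-trained to predict these labels would then be used as the initialization for the target dataset and fine-tuned on the labeled samples that are available, as described above.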