Overview of Video Self-Supervised Learning
2022-07-05 18:40:00 【Zhiyuan community】
https://arxiv.org/abs/2207.00419
The remarkable success of deep learning in various fields depends on the availability of large-scale annotated datasets. However, relying on human-generated annotations leads to models with biased learning, poor domain generalization, and poor robustness. Obtaining annotations is also expensive and labor-intensive, which is particularly challenging for video. As an alternative, self-supervised learning provides a way to learn representations without annotations, and it has shown promise in both the image and video domains. Compared with images, learning video representations is more challenging because the temporal dimension introduces motion and other environmental dynamics. This also creates an opportunity for video-specific ideas that advance self-supervised learning in the video and multimodal domains. In this survey, we review existing methods for self-supervised learning in the video domain. We group these methods into three categories according to their learning objectives: 1) pretext tasks, 2) generative modeling, and 3) contrastive learning. The methods also differ in the modalities they use: 1) video, 2) video-audio, 3) video-text, and 4) video-audio-text. We further introduce the commonly used datasets, downstream evaluation tasks, limitations of existing work, and potential future directions in this field.
The need for large-scale labeled samples limits the use of deep networks in problems where data are scarce and annotation is difficult, such as medical imaging Dargan et al. [2020]. Although pre-training on large-scale labeled datasets such as ImageNet Krizhevsky et al. [2012a] and Kinetics Kay et al. [2017] does improve performance, this approach has drawbacks, including annotation cost Yang et al. [2017], Cai et al. [2021], annotation bias Chen and Joo [2021], Rodrigues and Pereira [2018], lack of domain generalization Wang et al. [2021a], Hu et al. [2020], Kim et al. [2021], and lack of robustness Hendrycks and Dietterich [2019], Hendrycks et al. [2021]. Self-supervised learning (SSL) has emerged as a successful way to pre-train deep models and overcome some of these problems. It is a promising alternative: models can be trained on large datasets without labels Jing and Tian [2020], and they generalize better. SSL trains the model with a learning objective derived from the training samples themselves. The pre-trained model is then used as the initialization for the target dataset and fine-tuned with the available labeled samples. Figure 1 gives an overview of this approach.
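To make "a learning objective derived from the training samples themselves" more concrete, here is a minimal sketch of one possible pretext task: deciding whether the frames of a clip are in their original temporal order. It is only an illustration under stated assumptions, not the method of any particular paper in the survey. The function names make_order_sample and build_pretext_dataset are hypothetical, and frames are stand-in Python objects rather than real image tensors.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")  # stand-in for an image array or tensor


def make_order_sample(
    clip: Sequence[Frame], rng: random.Random
) -> Tuple[List[Frame], int]:
    """Build one pretext sample from an unlabeled clip.

    Returns (frames, label), where label is 1 if the frames keep their
    original temporal order and 0 if they were shuffled. The label comes
    from the data itself, so no human annotation is needed.
    """
    if len(clip) < 2:
        raise ValueError("a clip needs at least two frames")
    indices = list(range(len(clip)))
    if rng.random() < 0.5:
        # Positive sample: keep the original order.
        return [clip[i] for i in indices], 1
    # Negative sample: reshuffle until the order really differs.
    shuffled = indices[:]
    while shuffled == indices:
        rng.shuffle(shuffled)
    return [clip[i] for i in shuffled], 0


def build_pretext_dataset(
    videos: Sequence[Sequence[Frame]], rng: random.Random
) -> List[Tuple[List[Frame], int]]:
    """Turn a collection of unlabeled clips into (frames, label) pairs."""
    return [make_order_sample(clip, rng) for clip in videos]


if __name__ == "__main__":
    rng = random.Random(0)
    unlabeled = [["a0", "a1", "a2"], ["b0", "b1", "b2", "b3"]]
    for frames, label in build_pretext_dataset(unlabeled, rng):
        print(label, frames)
```

A model pre-trained to predict these labels could then serve as the initialization for the target dataset and be fine-tuned on whatever labeled samples are available, as described above.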