Self-supervised learning (SSL)
2022-06-23 17:31:00 【luemeon】
Contents
Supervised, unsupervised, and self-supervised learning

Supervised, unsupervised, and self-supervised learning
The main difference between supervised and unsupervised learning is whether the model requires manually labeled tag information.
The main purpose of self-supervised learning is to learn richer semantic representations.

The ability of self-supervised learning is assessed mainly through the Pretrain-Finetune paradigm.
Supervised Pretrain-Finetune workflow:
1. Train on a large amount of labeled data to obtain a pre-trained model.
2. For a new downstream task, transfer the learned parameters (for example, the parameters of the layers before the output layer) and fine-tune them on the new labeled task, yielding a network adapted to the new task.
Self-supervised Pretrain-Finetune workflow:
1. Train the network on a large amount of unlabeled data via a pretext task (supervision signals are constructed automatically from the data itself) to obtain a pre-trained model.
2. For a new downstream task, transfer the learned parameters and fine-tune them, just as in supervised learning.
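As an illustration, the two-stage flow above can be sketched with a toy numpy example. The data, the linear "encoder" parameter, and the sign-classification downstream task are all hypothetical assumptions for illustration, not any particular SSL method, and the second stage simply reuses the pretrained parameter rather than performing true gradient fine-tuning:

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Stage 1: self-supervised pretraining (no human labels) ---
# Pretext task: for vectors x = (x0, x1) with x1 ~= 2*x0, mask x1 and
# predict it from x0.  The target x1 comes from the data itself.
z = rng.normal(size=(1000, 1))
X = np.hstack([z, 2.0 * z + 0.01 * rng.normal(size=(1000, 1))])
x0, x1 = X[:, 0], X[:, 1]
w = (x0 @ x1) / (x0 @ x0)       # closed-form least squares; recovers ~2

# --- Stage 2: reuse on a small labeled downstream set ---
# Transfer w as a feature extractor; the downstream task is a toy
# sign-classification problem needing only 20 labeled samples.
x_small = rng.normal(size=20)
y_small = (x_small > 0).astype(int)
feats = w * x_small              # reuse the pretrained parameter
preds = (feats > 0).astype(int)
print("pretrained w:", w)
print("downstream accuracy:", (preds == y_small).mean())
```

The point of the sketch is only the division of labor: stage 1 never sees a human label, yet stage 2 needs very little labeled data because the transferred parameter already captures the structure.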
Therefore, the ability of self-supervised learning is reflected mainly by performance on downstream tasks.
Characteristics of supervised learning:
- For each picture, the machine predicts a category or a bounding box.
- Training data are manually labeled.
- Each sample provides very little information (for example, a label among 1024 categories carries only 10 bits of information).
Characteristics of self-supervised learning:
- For a picture, the machine can predict any part of it (supervision signals are constructed automatically).
- For a video, it can predict future frames.
- Each sample can provide a large amount of information.
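The 10-bit figure above follows from information theory: a label drawn from N equally likely classes carries log2(N) bits. A quick check for N = 1024:

```python
import math

# One label among 1024 mutually exclusive classes carries log2(1024) bits.
bits_per_label = math.log2(1024)
print(bits_per_label)  # 10.0
```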
Self-supervised learning
Core idea
Self-Supervised Learning first trains the parameters from a blank slate into a rough shape, and then from that rough shape into their final form.
1. The first stage trains them into a rough shape: a visual representation.
2. The second stage uses a labeled dataset, chosen according to the downstream task (Downstream Tasks), to train the parameters into their final form. The dataset used at this stage does not need to be large, because after the first training stage the parameters are already close.
The first stage involves no downstream task at all: it pre-trains on a pile of unlabeled data, with no specific task. The official phrase for this is: in a task-agnostic way.
The second stage involves the downstream task: labeled data are used to fine-tune for the downstream task. The official phrase for this is: in a task-specific way.

Fields of application
Self-Supervised Learning has produced classic work not only in NLP but also in CV and speech. It can be divided into 3 classes: Data-Centric, Prediction (also called Generative), and Contrastive.

Main methods
1. Context based
2. Temporal based
3. Contrastive based
Mainstream classification
Generative methods
Being able to reconstruct => being able to extract good feature representations
e.g. MAE, BERT
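As a minimal sketch of how a generative pretext task builds supervision from the data itself, here is BERT-style token masking in plain Python. The sentence, masking rate, and random seed are illustrative assumptions, not BERT's actual recipe:

```python
import random

random.seed(0)
MASK = "[MASK]"

def make_masked_example(tokens, mask_prob=0.3):
    """BERT-style pretext task: hide some tokens; the hidden originals
    become the prediction targets -- no human labeling required."""
    inputs, targets = [], []
    for tok in tokens:
        if random.random() < mask_prob:
            inputs.append(MASK)
            targets.append(tok)   # supervision comes from the data itself
        else:
            inputs.append(tok)
            targets.append(None)  # not predicted at this position
    return inputs, targets

sentence = "self supervised learning builds labels from data".split()
inputs, targets = make_masked_example(sentence)
print(inputs)
print(targets)
```

A model trained on this pretext task must predict each masked original token from its context, which is exactly the "can rebuild => good features" intuition above.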
Contrastive methods
Distinguish different inputs in the feature space
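A minimal numpy sketch of the contrastive idea, using an InfoNCE-style loss over cosine similarities. The feature vectors and temperature here are illustrative assumptions; real methods (e.g. SimCLR, MoCo) compute features with a trained encoder:

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE-style contrastive loss: pull the positive pair together,
    push negatives apart in feature space (minimal numpy sketch)."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    sims = np.array([cos(anchor, positive)] + [cos(anchor, n) for n in negatives])
    logits = sims / tau
    logits = logits - logits.max()               # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])                     # positive sits at index 0

anchor    = np.array([1.0, 0.0])
positive  = np.array([0.9, 0.1])   # another "view" of the same input
negatives = [np.array([0.0, 1.0]), np.array([-1.0, 0.2])]

good = info_nce(anchor, positive, negatives)               # aligned pair
bad  = info_nce(anchor, negatives[0], [positive, negatives[1]])  # mismatched
print(good, bad)
```

Minimizing this loss forces representations of two views of the same input to be close while staying far from other inputs, which is the "distinguish different inputs in the feature space" criterion above.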
