[Lecture notes] How to do deep learning with imperfect data?
2022-07-29 07:49:00 【Have you studied hard today】
Intro: The success of deep learning rests not only on powerful models, but also on the large amounts of high-quality data that support them. What can we do when the data available for training is imperfect and suffers from various problems?
This lecture introduces several imperfect-data scenarios and the methods that address them, namely federated learning, long-tail learning, noisy label learning, and continual learning, and shows how deep learning can remain effective in each case.
Success of Deep Learning
What is a good dataset?
Large-scale labeled data
Good training data should have the following traits:
- Accessible
- Large-scale
- Balanced
- Clean
If your data does not have these perfect-dataset characteristics, how can deep learning still be made effective?
- Data is locally stored (the data is not in your hands; how can you use others' data to train your own model?): Federated Learning
- Class distribution is imbalanced (the data classes are unbalanced): Long-tail Learning
- Labels are not accurate (the data is dirty): Noisy Label Learning
- Partial data is available (only part of the data is available at any time): Continual Learning
Federated Learning
Federated Learning Framework
Federated learning transmits model parameters instead of raw data.
- Applicable scenario: much useful data is inaccessible because it is private.
- Federated learning was proposed by Google in 2016. The goal is to learn a model without centralized training.
- Data is stored privately on each client.
- Models are trained locally and then aggregated on the server.
- We send model parameters rather than data.
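A minimal sketch of the server-side aggregation step in the spirit of FedAvg; the helper name is illustrative, and all state_dict entries are assumed to be float tensors:

```python
import copy

def fedavg_aggregate(global_model, client_states, client_sizes):
    """Average client parameters, weighted by local dataset size.

    client_states: list of state_dicts returned by clients after local training.
    client_sizes:  number of local training samples on each client.
    """
    total = float(sum(client_sizes))
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        # Weighted sum of the clients' versions of this tensor.
        avg[key] = sum(state[key] * (n / total)
                       for state, n in zip(client_states, client_sizes))
    global_model.load_state_dict(avg)
    return global_model
```

Only these parameter tensors travel between clients and server; the raw data never leaves a client.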
Main difficulty: data heterogeneity
- The number of training samples on each client differs.
- The classes available for training on each client differ.
- The imbalance ratio on each client differs.
Other difficulties:
- Personalized FL: personalized federated learning
- Communication and Compression: model updates must be transmitted and compressed
- Preserving Privacy: privacy protection (data can sometimes be inferred from the model)
- Fairness: the model should work comparably well for every client
- Data Poisoning Attacks: someone may try to damage the model with bad data
- Incentive: reward mechanisms (some participants may free-ride, so each client's contribution to the model needs to be quantified)
- Vertical Federated Learning
- …
Long-tail Learning
- Applicable scenario: class imbalance.
Some classes have far more samples than others.
majority class & minority class
Before deep learning became popular, two kinds of methods were commonly used for data imbalance (a sketch of both follows this list):
- Re-sampling: sample the data so it becomes more balanced
- Re-weighting: e.g., make the penalty for mistakes on certain classes heavier
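A minimal PyTorch sketch of both classic remedies; the class counts and labels below are made-up numbers for illustration:

```python
import torch
import torch.nn as nn
from torch.utils.data import WeightedRandomSampler

# Hypothetical training set: class 0 is the majority class.
counts = torch.tensor([5000.0, 500.0, 50.0])
labels = [0] * 5000 + [1] * 500 + [2] * 50

# Re-weighting: inverse-frequency class weights make mistakes on
# rare classes cost more in the loss.
class_weights = counts.sum() / (len(counts) * counts)
criterion = nn.CrossEntropyLoss(weight=class_weights)

# Re-sampling: draw rare-class samples more often, so mini-batches
# are approximately class-balanced.
sample_weights = [1.0 / counts[y].item() for y in labels]
sampler = WeightedRandomSampler(sample_weights, num_samples=len(labels))
# loader = DataLoader(train_set, batch_size=128, sampler=sampler)
```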
Deep learning brings new challenges:
- Classification tasks now have far more categories, e.g., thousands. The imbalance problem becomes much more complicated: do half the classes have many samples and half have few, or do many classes have few samples and only a few have many?
- Most deep learning models are end-to-end.
Therefore, a new concept was proposed in 2019: long-tail learning.
Compared with traditional imbalanced learning, long-tail learning has the following characteristics:
- Many categories
- The number of samples per class follows a power-law distribution
- Focus on deep learning models (mostly CV tasks)
Methodology:
- Re-weighting: misclassifying a minority-class sample incurs a heavier penalty
- Augmentation: data augmentation
- Decoupling: re-sampling or re-weighting may damage the feature representation and only help build the classifier, so representation learning and classifier training are separated (see the two-stage sketch after this list)
- Ensemble Learning: train multiple models and vote
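A sketch of the decoupling idea in the style of two-stage classifier re-training; `model.backbone`, `model.classifier`, the two loaders, and `opt` are assumed to exist and are only illustrative:

```python
import torch
import torch.nn.functional as F

# Stage 1: representation learning with natural (instance-balanced)
# sampling -- no re-sampling or re-weighting, so features stay intact.
for x, y in plain_loader:
    loss = F.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Stage 2: freeze the backbone and re-train only the classifier,
# this time with class-balanced sampling.
for p in model.backbone.parameters():
    p.requires_grad = False
clf_opt = torch.optim.SGD(model.classifier.parameters(), lr=0.1)
for x, y in balanced_loader:
    loss = F.cross_entropy(model(x), y)
    clf_opt.zero_grad()
    loss.backward()
    clf_opt.step()
```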
Noisy Label Learning
Applicable scenario: labels contain a certain rate of errors.
Methods:
Image source: B. Han et al., "A Survey of Label-noise Representation Learning: Past, Present and Future", 2020.
For example:
Estimate the noise transition matrix, i.e., the probability that samples of one true class get labeled as another class.
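A minimal sketch of forward loss correction with an already-estimated transition matrix `T`, where `T[i, j]` approximates P(observed label = j | true label = i):

```python
import torch
import torch.nn.functional as F

def forward_corrected_loss(logits, noisy_labels, T):
    """Push the model's clean-class posterior through the noise
    transition matrix, then score it against the noisy labels."""
    p_clean = F.softmax(logits, dim=1)   # P(true class | x), shape (B, C)
    p_noisy = p_clean @ T                # P(observed class | x)
    return F.nll_loss(torch.log(p_noisy + 1e-8), noisy_labels)
```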
Co-Teaching: train two networks simultaneously; each network selects its small-loss (likely clean) samples in every mini-batch and hands them to its peer for the update.
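A sketch of one Co-Teaching update step; in the original method `keep_ratio` starts near 1 and decays as training proceeds:

```python
import torch
import torch.nn.functional as F

def coteaching_step(net1, net2, opt1, opt2, x, y, keep_ratio):
    """Each network picks its small-loss (likely clean) samples,
    and its peer is updated on those samples."""
    n_keep = int(keep_ratio * len(y))

    with torch.no_grad():
        loss1 = F.cross_entropy(net1(x), y, reduction="none")
        loss2 = F.cross_entropy(net2(x), y, reduction="none")
    idx1 = torch.argsort(loss1)[:n_keep]   # net1's small-loss picks
    idx2 = torch.argsort(loss2)[:n_keep]   # net2's small-loss picks

    # Cross update: net1 learns from net2's picks, and vice versa.
    opt1.zero_grad()
    F.cross_entropy(net1(x[idx2]), y[idx2]).backward()
    opt1.step()

    opt2.zero_grad()
    F.cross_entropy(net2(x[idx1]), y[idx1]).backward()
    opt2.step()
```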
Future Direction: OOD Noise
Clean labels, ID noise, OOD noise (out-of-distribution)
Continual Learning
(also known as lifelong learning, incremental learning, or data-stream learning)
Data arrives as time goes on.
The problems are:
- Memory is limited, so earlier samples must be discarded
- The data distribution may change
- What was learned in the past must not be forgotten
Trade-off: the model should be stable, yet plastic (stability vs. plasticity).
Plasticity comes relatively easily to deep learning models, but they also easily forget what they learned before; this phenomenon is called catastrophic forgetting.
Replay methods:
Select and keep a few representative samples from each task, and incorporate them into the training process of future tasks.
How are they used? For example, GEM adds a constraint during training: the new model's performance on the stored old samples must not get worse.
How are they chosen? For example, via dataset compression (Dataset Condensation).
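A minimal replay memory using reservoir sampling, one common (here purely illustrative) way to keep a bounded, roughly uniform sample of everything seen so far:

```python
import random

class ReplayBuffer:
    """Fixed-size memory of past samples; reservoir sampling gives
    every example seen so far an equal chance of being kept."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []
        self.n_seen = 0

    def add(self, sample):
        self.n_seen += 1
        if len(self.data) < self.capacity:
            self.data.append(sample)
        else:
            j = random.randrange(self.n_seen)
            if j < self.capacity:
                self.data[j] = sample   # evict a random old entry

    def sample(self, k):
        return random.sample(self.data, min(k, len(self.data)))
```

During later tasks, batches drawn from the buffer are mixed into the new task's training batches.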
Shortcomings of replay methods:
- They cannot fully meet the requirements of lifelong learning, since large amounts of data are still thrown away.
- Some data cannot be stored at all.
Nevertheless, SOTA methods are still based on Dataset Condensation.
Regularization-based methods
Such methods store no past data, only the old model. During optimization, the new model is required not to deviate too much from the old one.
Elastic Weight Consolidation (EWC)
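A sketch of the EWC penalty, assuming `old_params` and a diagonal Fisher estimate `fisher` (typically the mean squared gradients on the previous task) were saved after that task, both keyed by parameter name:

```python
def ewc_penalty(model, old_params, fisher, lam=100.0):
    """Quadratic penalty anchoring parameters that were important
    to past tasks (large Fisher values) near their old values."""
    penalty = 0.0
    for name, p in model.named_parameters():
        penalty = penalty + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * penalty

# total_loss = task_loss + ewc_penalty(model, old_params, fisher)
```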
Parameter isolation methods
Dedicate different model parameters to each task, to prevent any possible forgetting.
Generally, the parameters important for past tasks are frozen.
The model is heavily over-parameterized and not all parameters are useful, so a large model can be compressed into a smaller one that maintains its function. After learning each task, compress the model, then use the freed parameter space to learn the next task.
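A bare-bones illustration of the freezing step; the set `important_for_old_tasks` is hypothetical and would come from, e.g., a pruning criterion:

```python
# Freeze parameters marked important for previous tasks; only the
# remaining (freed) parameters stay trainable for the new task.
for name, p in model.named_parameters():
    if name in important_for_old_tasks:
        p.requires_grad = False
```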
Conclusion
The above discussed four kinds of imperfect data in deep learning model training:
- Federated learning: data is not centralized.
- Long-tail learning: data is class imbalanced.
- Noisy label learning: data is mislabeled.
- Continual learning: data is gradually coming.
Reference:
Lu Yang, Xiamen University, lecture on the frontiers of information technology
Author: Chier