Raki's notes on reading paper: Memory Replay with Data Compression for Continual Learning
2022-06-11 19:22:00 【Sleepy Raki】
Abstract & Introduction & Related Work
- Research task
  - Continual learning
- Existing methods and related work
- Challenges
  - Existing work mainly relies on a small memory buffer holding a small number of original samples, which cannot fully capture the old data distribution
  - Existing work often requires training extra parameters or distilling old features
- Key ideas
  - In this work, we propose memory replay with data compression, which reduces the storage cost of old training samples and thereby increases the number that can be stored in the memory buffer
  - We propose a novel method based on determinantal point processes (DPPs) to efficiently determine an appropriate compression quality for the currently arriving training samples; thus, with a naive data compression algorithm at a properly selected quality, far more compressed data can be saved in the limited storage space, greatly improving recent strong baselines
- Experimental conclusion
  - Extensive experiments show that using a naive data compression algorithm with a properly selected quality allows far more compressed data to be saved in the limited storage space, greatly improving memory replay
Unlike the "artificial" memory replay in DNNs, a key feature of biological memory is that it encodes and replays old experiences in a highly compressed form to overcome catastrophic forgetting.

Data compression
Data compression aims to improve the storage efficiency of files and includes lossless and lossy compression. Lossless compression must perfectly reconstruct the original data from the compressed data, which limits its compression ratio (Shannon, 1948). By comparison, lossy compression achieves higher compression ratios by discarding some information from the original data, so it is widely used in practice. Typical hand-crafted methods include JPEG (or JPG) (Wallace, 1992), the most commonly used lossy compression algorithm (Mentzer et al., 2020), WebP (Lian & Shilei, 2012), and JPEG2000 (Rabbani, 2002). On the other hand, neural compression methods usually rely on optimizing the Shannon rate-distortion trade-off, implemented with RNNs (Toderici et al., 2015; 2017), autoencoders (Agustsson et al., 2017), and GANs (Mentzer et al., 2020).
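The lossless/lossy contrast above can be sketched in a few lines. This is a minimal stdlib-only illustration (zlib stands in for a lossless codec, and a crude quantization step stands in for the information loss of a lossy codec; this is not the paper's JPEG pipeline):

```python
import zlib

# Toy 8-bit "signal": 16 repeats of a 256-byte pattern.
data = bytes((i * 37) % 256 for i in range(4096))

# Lossless: zlib must allow perfect reconstruction, which limits the
# achievable compression ratio.
lossless = zlib.compress(data, level=9)
assert zlib.decompress(lossless) == data  # exact reconstruction

# "Lossy": quantize to 16 levels first (information is discarded), then
# compress. The quantized stream compresses further, but the original
# bytes can no longer be recovered exactly.
quantized = bytes((b // 16) * 16 for b in data)
lossy = zlib.compress(quantized, level=9)

print(len(data), len(lossless), len(lossy))
```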
METHOD
MEMORY REPLAY WITH DATA COMPRESSION
If the compression ratio is too high, training quality suffers. Intuitively, this leads to a quality-quantity trade-off: under a fixed storage budget, lowering the compression quality $q$ increases the number $N_q^{mb}$ of compressed samples that can be stored in the memory buffer, and vice versa.
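The trade-off can be made concrete with a back-of-the-envelope calculation. The per-image sizes below are illustrative assumptions, not measurements from the paper:

```python
# Fixed memory budget: assume it equals 20 uncompressed images of ~112 KB each.
BUDGET_KB = 20 * 112

# Assumed average compressed size (KB) at each JPEG quality level.
avg_size_kb = {100: 112.0, 90: 40.0, 75: 25.0, 50: 15.0, 10: 5.0}

def buffer_capacity(q):
    """N_q^mb: how many quality-q images fit into the fixed budget."""
    return int(BUDGET_KB // avg_size_kb[q])

for q in sorted(avg_size_kb, reverse=True):
    print(q, buffer_capacity(q))  # lower quality -> more stored samples
```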
Here we use JPEG (Wallace, 1992), a simple but commonly used lossy compression algorithm, to compress images and evaluate the proposed idea. JPEG can save images at a quality level in [1, 100]; reducing the quality yields smaller file sizes. Using a memory buffer equivalent to 20 original images per class (Hou et al., 2019), we run a grid search over JPEG quality with representative memory replay methods such as LUCIR (Hou et al., 2019) and PODNet (Douillard et al., 2020). As shown in Figure 2, memory replay of compressed data at an appropriate quality can greatly exceed the performance of replaying the original data. However, a quality that is either too high or too low hurts performance. In particular, the quality that achieves the best performance differs across memory replay methods, but for a given method it is consistent across different numbers of incremental phases.
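The grid-search protocol can be sketched as below. `train_and_evaluate` is a hypothetical stub (a real run would retrain LUCIR or PODNet with the buffer compressed at each quality), and its accuracy numbers are invented to mimic the observed "too high or too low both hurt" behavior:

```python
def train_and_evaluate(quality):
    # Hypothetical stand-in for retraining a replay method (e.g. LUCIR)
    # with the memory buffer compressed at this JPEG quality.
    assumed_accuracy = {10: 55.0, 25: 60.0, 50: 63.0,
                        75: 61.0, 90: 58.0, 100: 56.0}
    return assumed_accuracy[quality]

grid = [10, 25, 50, 75, 90, 100]
results = {q: train_and_evaluate(q) for q in grid}
best_q = max(results, key=results.get)
print(best_q)  # an intermediate quality wins under these assumed numbers
```

Each grid point costs one full training run, which is exactly why the paper then looks for a cheaper way to pick $q$.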
QUALITY-QUANTITY TRADE-OFF
The smaller $q$ is (i.e., the higher the compression), the more data can be stored; the key is therefore to choose an appropriate $q$ for memory replay.
Figure: t-SNE visualization of features of the original subset (light dots) and its compressed subset (dark dots) after learning 5-phase ImageNet-sub with LUCIR. From left to right, the number of images per class grows from 37 to 85 to 200 while the JPEG quality drops from 90 to 50 to 10. Five classes from the latest task are plotted in different colors; as the quality decreases, the compressed features increasingly drift outside the original distribution.
The appropriate $q$ is chosen by optimizing the following maximum likelihood:
The construction of $D_q^{mb}$ can essentially be viewed as a sampling problem, and we apply determinantal point processes (DPPs) to model the conditional probability. DPPs are not only an elegant probabilistic sampling model that describes the probability of each possible subset by a determinant; they also provide a geometric interpretation of that probability as the volume spanned by the elements of the subset.
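The determinant-as-probability idea can be sketched with an L-ensemble DPP over three toy items (the kernel values below are made up for illustration):

```python
def det(m):
    """Determinant by Laplace expansion (fine for tiny matrices)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] *
               det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def dpp_prob(L, subset):
    """L-ensemble DPP: P(S) = det(L_S) / det(L + I)."""
    n = len(L)
    L_S = [[L[i][j] for j in subset] for i in subset]
    L_plus_I = [[L[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                for i in range(n)]
    return (det(L_S) if subset else 1.0) / det(L_plus_I)

# Toy similarity kernel: items 0 and 1 are near-duplicates, item 2 is
# distinct. A DPP assigns higher probability to diverse subsets.
L = [[1.0, 0.9, 0.1],
     [0.9, 1.0, 0.1],
     [0.1, 0.1, 1.0]]

print(dpp_prob(L, (0, 1)), dpp_prob(L, (0, 2)))  # diverse pair is likelier
```

Geometrically, $\det(L_S)$ is the squared volume spanned by the feature vectors of the items in $S$, matching the "capacity of all elements in the subset" interpretation above.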

Due to its high complexity, this objective is difficult to optimize directly, so we instead optimize $M_q^*$.


VALIDATE OUR METHOD WITH GRID SEARCH RESULTS

We now use the grid search results to validate the quality determined by our method. LUCIR and PODNet achieve their best performance on ImageNet-sub at JPEG quality 50 and 75, respectively, because these are the smallest qualities satisfying $|R_q - 1| < \epsilon$. Therefore, the quality determined by our method agrees with the grid search results while saving more than 100x the computation cost. Interestingly, for each quality $q$, whether $|R_q - 1| < \epsilon$ holds is consistent between individual incremental phases and the average over all incremental phases. We further discuss more dynamic behavior of $R_q$ in Appendix D.4.
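The selection rule can be sketched directly. The $R_q$ values here are invented for illustration, not the paper's measurements:

```python
EPSILON = 0.05

# Assumed R_q per candidate JPEG quality (illustrative values only).
R = {10: 0.62, 50: 0.96, 75: 0.98, 90: 0.99, 100: 1.0}

def select_quality(R, eps):
    """Pick the smallest (most compressed) q with |R_q - 1| < eps."""
    feasible = [q for q, r in R.items() if abs(r - 1.0) < eps]
    return min(feasible)

print(select_quality(R, EPSILON))  # smallest quality inside the tolerance
```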
EXPERIMENT





CONCLUSION
In this work, we have proposed that data compression with an appropriately selected compression quality allows far more compressed data to be stored in limited storage space, thereby greatly improving the efficiency of memory replay. To determine the compression quality efficiently, we provide a novel method based on determinantal point processes (DPPs) that avoids repeated training. Our method is validated on class-incremental learning and semi-supervised continual learning, as well as object detection. Our work not only provides an important yet under-explored baseline, but also opens up a promising new direction for continual learning. Further work could develop compression algorithms adaptive to incremental data to improve the compression ratio, or propose new regularization methods to constrain the distribution changes caused by data compression. Meanwhile, the DPP-based theoretical analysis can serve as a general framework to incorporate optimizable variables of memory replay, such as the strategy for selecting prototypes. In addition, our work raises the question of how to save a batch of training data in limited storage space so as to best describe its distribution, which may promote wider applications in data compression and data selection.
Remark
Some improvement is achieved through data compression, though I couldn't follow the pile of complicated formulas. The paper shows that compressing data to store more images for replay yields better performance. Not bad; I would like this paper even more if it were simpler.