Few-shot learning datasets
2022-06-25 05:03:00 【MondayCat111】
Article reprinted from: https://blog.csdn.net/qq_36104364/article/details/107508592
This article collects the few-shot learning datasets commonly used in recent years and gives an introduction, a reference and a download address for each. All the resources I have are uploaded to Baidu cloud disk; for the other datasets the official download addresses are given (some may require a VPN to access). Finally, the datasets are summarized in a table.
1. Omniglot
The Omniglot dataset consists of 1,623 handwritten characters from 50 different alphabets, and each character was drawn by 20 different people. This makes it a few-shot handwritten character dataset with a very large number of classes (1,623) but very few samples per class (20). In practice, 1,200 characters are usually used as the training set and the remaining 423 as the validation set. The data are expanded by rotating the images by 90°, 180° and 270°, and every image is brought to a uniform size of 28*28.
Reference: Lake B, Salakhutdinov R, Gross J, et al. One shot learning of simple visual concepts[C]//Proceedings of the annual meeting of the cognitive science society. 2011, 33(33).
Download address: https://pan.baidu.com/s/19Y5aGfa-lNEZTDUeL1jP4g
Extraction code: 4y3z
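The rotation-based expansion described for Omniglot can be sketched in a few lines. This is a minimal pure-Python illustration, not the original authors' pipeline: images are plain nested lists instead of real Omniglot files, the helper names `rotate90` and `expand_with_rotations` are hypothetical, and treating every rotation as a separate class is an assumption (a common convention, but not stated in the text above).

```python
from typing import Dict, List

Image = List[List[int]]  # a 2D grid of pixel values


def rotate90(img: Image) -> Image:
    """Rotate a 2D grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def expand_with_rotations(dataset: Dict[str, List[Image]]) -> Dict[str, List[Image]]:
    """Add the 90, 180 and 270 degree rotations of every class.

    Assumption: each rotation is stored as a separate class, so the
    number of classes grows by a factor of four.
    """
    expanded: Dict[str, List[Image]] = {}
    for name, images in dataset.items():
        current = images
        expanded[f"{name}_rot0"] = current
        for angle in (90, 180, 270):
            # Each step rotates the previous result by another 90 degrees.
            current = [rotate90(img) for img in current]
            expanded[f"{name}_rot{angle}"] = current
    return expanded


# Toy example: one class holding a single 2x2 "image".
toy = {"char_a": [[[1, 2], [3, 4]]]}
expanded = expand_with_rotations(toy)
print(sorted(expanded))
print(expanded["char_a_rot90"][0])
```

Under the same assumption, applying this to the 1,200 training characters would give 4,800 training classes. With real image files an array or imaging library would normally do the rotation instead of nested lists.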
2. miniImageNet
The miniImageNet dataset consists of 60,000 images selected from the ImageNet dataset, covering 100 classes with 600 images each. Every image is 84*84. In practice, 80 classes are usually used as the training set and the remaining 20 as the validation set. Some papers instead divide it into a base set (base classes, 64), a validation set (validation classes, 16) and a novel set (novel classes, 20).
Reference: Vinyals O, Blundell C, Lillicrap T, et al. Matching networks for one shot learning[C]//Advances in neural information processing systems. 2016: 3630-3638.
Download address: https://pan.baidu.com/s/1nqBSA1w5mQuhlrQeCY4HgA
Extraction code: ajrz
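The 64/16/20 division described for miniImageNet is a split over classes, so no class appears in more than one subset. The sketch below shows one way such a split could be produced and how a single N-way K-shot episode might then be drawn from the novel classes. Everything here is illustrative: the class names and image ids are placeholders, the seeds are arbitrary, and `split_classes` and `sample_episode` are hypothetical helpers. The published split files name specific ImageNet classes, which this sketch does not reproduce.

```python
import random
from typing import Dict, List, Sequence, Tuple


def split_classes(classes: Sequence[str], sizes: Tuple[int, int, int], seed: int = 0):
    """Shuffle class names and cut them into three disjoint subsets."""
    assert sum(sizes) == len(classes)
    rng = random.Random(seed)
    shuffled = list(classes)
    rng.shuffle(shuffled)
    n_base, n_val, _ = sizes
    base = shuffled[:n_base]
    val = shuffled[n_base:n_base + n_val]
    novel = shuffled[n_base + n_val:]
    return base, val, novel


def sample_episode(images_by_class: Dict[str, List[str]], classes: Sequence[str],
                   n_way: int, k_shot: int, n_query: int, rng: random.Random):
    """Draw one N-way K-shot episode: a support set and a query set."""
    support, query = [], []
    for label, cls in enumerate(rng.sample(list(classes), n_way)):
        picked = rng.sample(images_by_class[cls], k_shot + n_query)
        support += [(img, label) for img in picked[:k_shot]]
        query += [(img, label) for img in picked[k_shot:]]
    return support, query


# Placeholder data: 100 classes with 600 image ids each, as in miniImageNet.
all_classes = [f"class_{i:03d}" for i in range(100)]
images = {c: [f"{c}_img_{j}" for j in range(600)] for c in all_classes}

base, val, novel = split_classes(all_classes, (64, 16, 20))
support, query = sample_episode(images, novel, n_way=5, k_shot=1, n_query=15,
                                rng=random.Random(1))
print(len(base), len(val), len(novel))
print(len(support), len(query))
```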
3. tieredImageNet
The tieredImageNet dataset is also selected from ImageNet. It contains 34 high-level categories, each of which contains 10 to 30 classes, and each class has a varying number of image samples, for a total of 608 classes and 779,165 images (about 1,281 images per class on average). The 34 categories are divided into a training set (20 categories), a validation set (6 categories) and a test set (8 categories). The split is shown in the following figure.
Reference: Ren M, Triantafillou E, Ravi S, et al. Meta-learning for semi-supervised few-shot classification[J]. arXiv preprint arXiv:1803.00676, 2018.
Download address: https://drive.google.com/uc?export=download&confirm=_SLS&id=1g1aIDy2Ar_MViF2gDXFYDBTR-HYecV07
4. CUB-200
CUB-200 is short for the Caltech-UCSD Birds-200-2011 dataset, a bird image database provided by the California Institute of Technology that contains 11,788 images of 200 bird species. In practice it is usually divided into a training set (100 classes), a validation set (50 classes) and a test set (50 classes), and the images are uniformly cropped to 84*84.
Reference: Catherine Wah, Steve Branson, Peter Welinder, Pietro Perona, and Serge Belongie. The Caltech-UCSD Birds-200-2011 dataset. 2011.
Download address: https://pan.baidu.com/s/1DEmLxePvDuJX1goSzM9r6Q
Extraction code: f1l5
5. CIFAR-FS
CIFAR-FS is short for the CIFAR100 Few-Shots dataset. It is derived from the CIFAR-100 dataset and contains 100 classes with 600 images each, 60,000 images in total. In practice it is usually divided into a training set (64 classes), a validation set (16 classes) and a test set (20 classes), and all images are 32*32.
Reference: Bertinetto L, Henriques J F, Torr P H S, et al. Meta-learning with differentiable closed-form solvers[J]. arXiv preprint arXiv:1805.08136, 2018.
Download address: https://pan.baidu.com/s/1HqRUw3dmsMBInt_Fh3J_Uw
Extraction code: ub38
6. ImageNet-1K Challenge
The ImageNet-1K Challenge dataset is also derived from ImageNet and contains 1,000 classes. In practice it is usually divided into a base dataset (389 classes) and a novel dataset (611 classes).
Reference: Hariharan B, Girshick R. Low-shot visual recognition by shrinking and hallucinating features[C]//Proceedings of the IEEE International Conference on Computer Vision. 2017: 3018-3027.
Download address: http://www.image-net.org/
7. FC100
FC100 is short for the Few-shot CIFAR100 dataset. Like CIFAR-FS above, it is derived from the CIFAR-100 dataset and contains 100 classes with 600 images each, 60,000 images in total. The difference is that FC100 is split by superclass rather than by class. It contains 20 superclasses (100 classes): the training set takes 12 superclasses (60 classes), the validation set 4 superclasses (20 classes) and the test set 4 superclasses (20 classes).
Reference: Oreshkin B, López P R, Lacoste A. Tadam: Task dependent adaptive metric for improved few-shot learning[C]//Advances in Neural Information Processing Systems. 2018: 721-731.
Download address: https://pan.baidu.com/s/1Wnlp1-obKsMLcHITYQ1CLg
Extraction code: kcd6
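Splitting by superclass means that whole groups of related classes go to a single subset, so the test classes share no superclass with the training classes. The sketch below illustrates the idea on a placeholder hierarchy with the same shape as CIFAR-100 (20 superclasses of 5 classes each). The names are invented, `split_by_superclass` is a hypothetical helper, and the assignment of superclasses to subsets is arbitrary rather than the published FC100 split.

```python
from typing import Dict, List

# Placeholder hierarchy: 20 superclasses with 5 classes each (100 classes),
# mirroring the shape of CIFAR-100. The names are illustrative only.
hierarchy: Dict[str, List[str]] = {
    f"super_{s:02d}": [f"super_{s:02d}_class_{c}" for c in range(5)]
    for s in range(20)
}


def split_by_superclass(hierarchy: Dict[str, List[str]],
                        n_train: int, n_val: int, n_test: int) -> Dict[str, List[str]]:
    """Assign whole superclasses to train/val/test so that none is shared."""
    supers = sorted(hierarchy)
    assert n_train + n_val + n_test == len(supers)
    groups = {
        "train": supers[:n_train],
        "val": supers[n_train:n_train + n_val],
        "test": supers[n_train + n_val:],
    }
    # Flatten each group of superclasses into its list of classes.
    return {
        name: [cls for s in members for cls in hierarchy[s]]
        for name, members in groups.items()
    }


fc100_like = split_by_superclass(hierarchy, n_train=12, n_val=4, n_test=4)
print({name: len(classes) for name, classes in fc100_like.items()})
```

With 5 classes per superclass, 12/4/4 superclasses correspond to 60/20/20 classes.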
Summary table of few-shot datasets
| Dataset | Source | Number of classes | Number of images | Image size |
|---|---|---|---|---|
| Omniglot | - | 1623 | 32,460 | 28*28 |
| miniImageNet | ImageNet | 100 | 60,000 | 84*84 |
| tieredImageNet | ImageNet | 608 | 779,165 | 84*84 |
| ImageNet 1K | ImageNet | 1000 | - | - |
| CIFAR-FS | CIFAR 100 | 100 | 60,000 | 32*32 |
| FC100 | CIFAR 100 | 100 | 60,000 | 32*32 |
| CUB-200 | - | 200 | 11,788 | 84*84 |
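Several of the image counts in the table follow directly from the per-class figures given in the dataset descriptions. The short check below recomputes them. The dictionary layout is just an illustrative choice, and the numbers are the ones quoted in this article.

```python
# Per-class figures quoted in the dataset descriptions.
datasets = {
    "Omniglot": {"classes": 1623, "per_class": 20},
    "miniImageNet": {"classes": 100, "per_class": 600},
    "CIFAR-FS": {"classes": 100, "per_class": 600},
    "FC100": {"classes": 100, "per_class": 600},
}

totals = {name: d["classes"] * d["per_class"] for name, d in datasets.items()}
for name, total in totals.items():
    print(f"{name}: {total:,} images")

# For tieredImageNet and CUB-200 the totals are not exact multiples of the
# class count, so only an average per class can be reported.
tiered_avg = 779_165 / 608
cub_avg = 11_788 / 200
print(f"tieredImageNet: about {tiered_avg:.1f} images per class")
print(f"CUB-200: about {cub_avg:.1f} images per class")
```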
8. FewRel dataset
FewRel is a relation extraction dataset released by Tsinghua University. It contains 100 relations and 44,800 instances (sentences), and it is a supervised dataset.
Download address: https://thunlp.github.io/fewrel.html
GitHub address: https://github.com/thunlp/FewRel
9. Stanford Dogs dataset
Download address: https://www.kesci.com/mw/dataset/5d22e94e688d36002c55105f
10. Stanford Cars dataset
Download address: http://ai.stanford.edu/~jkrause/cars/car_dataset.html