torch.utils.data.DataLoader() details [PyTorch getting started manual]

2022-06-10 15:53:00 【Classmate K】

Function prototype

DataLoader(dataset, batch_size=1, shuffle=False, sampler=None,
           batch_sampler=None, num_workers=0, collate_fn=None,
           pin_memory=False, drop_last=False, timeout=0,
           worker_init_fn=None, *, prefetch_factor=2,
           persistent_workers=False)

Purpose

Wraps a dataset in an iterable that yields its data as batches of Tensors, in a user-defined format.
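As a minimal sketch of this batching, the example below wraps a small in-memory dataset and iterates over it. The toy data and the variable names (features, labels, batch_shapes, num_full_batches) are made up for illustration and are not part of the original article.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy data: 10 samples with 3 features each, plus integer labels.
features = torch.arange(30, dtype=torch.float32).reshape(10, 3)
labels = torch.arange(10)
dataset = TensorDataset(features, labels)

# batch_size=4 over 10 samples gives batches of 4, 4 and 2 samples.
loader = DataLoader(dataset, batch_size=4, shuffle=False)
batch_shapes = [tuple(x.shape) for x, _ in loader]

# drop_last=True discards the final, incomplete batch.
trimmed = DataLoader(dataset, batch_size=4, drop_last=True)
num_full_batches = len(trimmed)

print(batch_shapes)       # shapes of the feature tensor in each batch
print(num_full_batches)   # number of batches when the remainder is dropped
```

Each iteration returns one batch for every tensor stored in the dataset, here a feature batch and a label batch, stacked along a new first dimension.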

Parameter description

dataset (Dataset) – the dataset from which to load the data.
batch_size (int, optional) – how many samples to load per batch. (default: 1)
shuffle (bool, optional) – set to True to have the data reshuffled at every epoch. (default: False)
sampler (Sampler or Iterable, optional) – defines the strategy for drawing samples from the dataset. Can be any Iterable that implements __len__. If specified, shuffle must not be specified.
batch_sampler (Sampler or Iterable, optional) – like sampler, but returns a batch of indices at a time. Mutually exclusive with batch_size, shuffle, sampler, and drop_last.
num_workers (int, optional) – how many subprocesses to use for data loading. 0 means that the data will be loaded in the main process. (default: 0)
collate_fn (callable, optional) – merges a list of samples to form a mini-batch of Tensor(s). Used for batched loading from a map-style dataset.
pin_memory (bool, optional) – if True, the data loader will copy Tensors into CUDA pinned memory before returning them. If your data elements are a custom type, or your collate_fn returns a batch that is a custom type, that type must define its own pin_memory() method for pinning to take effect. (default: False)
drop_last (bool, optional) – set to True to drop the last incomplete batch if the dataset size is not divisible by the batch size. If False and the dataset size is not divisible by the batch size, the last batch will be smaller. (default: False)
timeout (numeric, optional) – if positive, the timeout value for collecting a batch from workers. Should always be non-negative. (default: 0)
worker_init_fn (callable, optional) – If not None, this will be called on each worker subprocess with the worker id (an int in [0, num_workers - 1]) as input, after seeding and before data loading. (default: None)
prefetch_factor (int, optional, keyword-only arg) – number of batches loaded in advance by each worker. 2 means there will be a total of 2 * num_workers batches prefetched across all workers. (default: 2)
persistent_workers (bool, optional) – if True, the data loader will not shut down the worker processes after the dataset has been consumed once. This keeps each worker's Dataset instance alive. (default: False)
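The collate_fn and batch_sampler parameters described above are easier to follow with a concrete case. The sketch below is illustrative only: VariableLengthDataset and pad_collate are hypothetical names invented for this example, and zero-padding is just one possible way to merge samples of unequal length.

```python
import torch
from torch.nn.utils.rnn import pad_sequence
from torch.utils.data import BatchSampler, DataLoader, Dataset, SequentialSampler


class VariableLengthDataset(Dataset):
    """Hypothetical map-style dataset: item i is a 1-D tensor of length i + 1."""

    def __init__(self, size):
        self.size = size

    def __len__(self):
        return self.size

    def __getitem__(self, index):
        return torch.ones(index + 1)


def pad_collate(samples):
    """Zero-pad every sample to the longest length in the batch, then stack."""
    lengths = torch.tensor([len(s) for s in samples])
    padded = pad_sequence(samples, batch_first=True)
    return padded, lengths


dataset = VariableLengthDataset(5)

# A custom collate_fn replaces the default stacking, which would fail here
# because the samples in a batch have different lengths.
loader = DataLoader(dataset, batch_size=2, collate_fn=pad_collate)
padded_shapes = [tuple(padded.shape) for padded, _ in loader]

# batch_sampler yields whole lists of indices, so batch_size, shuffle,
# sampler and drop_last are left at their defaults on the DataLoader itself.
batch_sampler = BatchSampler(SequentialSampler(dataset), batch_size=2, drop_last=True)
sampled = DataLoader(dataset, batch_sampler=batch_sampler, collate_fn=pad_collate)
sampled_lengths = [lengths.tolist() for _, lengths in sampled]

print(padded_shapes)
print(sampled_lengths)
```

Both loaders use num_workers=0, so everything runs in the main process. With num_workers greater than 0, the dataset and collate_fn are used inside worker subprocesses, and on platforms that spawn workers the script generally needs an if __name__ == "__main__": guard.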