Cockpit is a visual and statistical debugger specifically designed for deep learning.

A Practical Debugging Tool for Training Deep Neural Networks

A better status screen for deep learning.


pip install 'git+https://github.com/f-dangel/cockpit.git'

Cockpit is a visual and statistical debugger specifically designed for deep learning. Training a deep neural network is often a pain! Successfully training such a network usually requires either years of intuition or expensive parameter searches involving lots of trial and error. Traditional debuggers provide only limited help: They can find syntactical errors but not training bugs such as ill-chosen learning rates. Cockpit offers a closer, more meaningful look into the training process with multiple well-chosen instruments.


Installation

To install Cockpit, simply run

pip install 'git+https://github.com/f-dangel/cockpit.git'

Documentation

The documentation provides a full tutorial on getting started with Cockpit as well as a detailed description of its API.

Experiments

To showcase the capabilities of Cockpit, we performed several experiments illustrating the usefulness of our debugging tool. For a discussion of those experiments, please refer to our paper.

License

Distributed under the MIT License. See LICENSE for more information.

Citation

If you use Cockpit, please consider citing:

Frank Schneider, Felix Dangel, Philipp Hennig
Cockpit: A Practical Debugging Tool for Training Deep Neural Networks
arXiv 2102.06604

@misc{schneider2021cockpit,
   title={{Cockpit: A Practical Debugging Tool for Training Deep Neural Networks}},
   author={Frank Schneider and Felix Dangel and Philipp Hennig},
   year={2021},
   eprint={2102.06604},
   archivePrefix={arXiv},
   primaryClass={cs.LG}
}
Comments
  • [BUG] Using `Alpha` with other quantities and a custom optimizer

    No support yet for custom optimizers?


    I get an error (log below) when trying Cockpit with a custom optimizer class (one that subclasses torch.optim.Optimizer), although it seems to work as expected with built-in optimizers (torch.optim.SGD).

    Log

    Traceback (most recent call last):
      File "laplace_primer.py", line 137, in <module>
        sgld.train(prefinal_train, train_loader, model)
      File "../SGLD_laplace.py", line 123, in train
        loss.backward(create_graph=cockpit.create_graph(epochs))
      File "./miniconda3/envs/backpack/lib/python3.7/site-packages/cockpit/context.py", line 137, in __exit__
        self.cp.track(self.global_step, protected_savefields=self.protected_savefields)
      File "../miniconda3/envs/backpack/lib/python3.7/site-packages/cockpit/cockpit.py", line 178, in track
        q.track(global_step, self.params, batch_loss)
      File "../miniconda3/envs/backpack/lib/python3.7/site-packages/cockpit/quantities/quantity.py", line 87, in track
        iteration, result = self.compute(global_step, params, batch_loss)
      File "./miniconda3/envs/backpack/lib/python3.7/site-packages/cockpit/quantities/quantity.py", line 516, in compute
        save_result = self._compute(global_step, params, batch_loss)
      File "../miniconda3/envs/backpack/lib/python3.7/site-packages/cockpit/quantities/quantity.py", line 538, in _compute
        self._compute_start(global_step, params, batch_loss)
      File "../miniconda3/envs/backpack/lib/python3.7/site-packages/cockpit/quantities/alpha.py", line 280, in _compute_start
        self._save_1st_order_info(global_step, params, batch_loss, point, until)
      File "../miniconda3/envs/backpack/lib/python3.7/site-packages/cockpit/quantities/alpha.py", line 332, in _save_1st_order_info
        id(p): batch_size * p.grad_batch.data.clone().detach() for p in params
      File "../miniconda3/envs/backpack/lib/python3.7/site-packages/cockpit/quantities/alpha.py", line 332, in <dictcomp>
        id(p): batch_size * p.grad_batch.data.clone().detach() for p in params
    AttributeError: 'Parameter' object has no attribute 'grad_batch'

    🐛 Type: Bug 👍 Status: Done 
    opened by nmndeep 8
  • Plotter crashes when using `show_plot=False, save_plot=True`

    Description

    Plotting to a file crashes (see stacktrace below). Calling plot with show_plot=True, save_plot=True works and results in the file being saved correctly. The issue is reproducible over multiple runs and thus probably not caused by weird gradients etc.

    Steps to Reproduce

    1. Setup

    config = [
        quantities.Distance(schedules.linear(interval=1)),
        quantities.GradHist1d(schedules.linear(interval=1)),
        quantities.GradNorm(schedules.linear(interval=1)),
        quantities.InnerTest(schedules.linear(interval=1)),
        quantities.Loss(schedules.linear(interval=1)),
        quantities.NormTest(schedules.linear(interval=1)),
        quantities.OrthoTest(schedules.linear(interval=1)),
        quantities.UpdateSize(schedules.linear(interval=1)),
        quantities.GradHist2d(schedules.linear(interval=1)),
    ]
    self._cockpit = Cockpit(model.parameters(), quantities=config)
    self._cockpit_plotter = CockpitPlotter()

    2. Track a few backward steps

    3. Call plot with show_plot=False and save_plot=True

    self._cockpit_plotter.plot(
        self._cockpit,
        savedir=cockpit_log_dir,
        show_plot=False,
        save_plot=True,
    )
    

    Traceback (most recent call last):
      File "code/train.py", line 215, in <module>
        metrics = main(opts)
      File "code/train.py", line 185, in main
        training_info = trainer.train(model, dataset, eval_data, logger)
      File "./code/trainers/mse_trainer.py", line 186, in train
        self.log(logger.root_dir)
      File "./code/trainers/base_trainer.py", line 145, in log
        save_plot=True)
      File "./envs/pytorch/lib/python3.7/site-packages/cockpit/plotter.py", line 142, in plot
        self._plot_gradients(self.grid_spec[0, 1])
      File "./envs/pytorch/lib/python3.7/site-packages/cockpit/plotter.py", line 223, in _plot_gradients
        instruments.gradient_tests_gauge(self, self.fig, self.gs_gradients[1, 1])
      File "./envs/pytorch/lib/python3.7/site-packages/cockpit/instruments/gradient_tests_gauge.py", line 69, in gradient_tests_gauge
        _format(self, ax_all, ax_norm, ax_inner, ax_ortho)
      File "./envs/pytorch/lib/python3.7/site-packages/cockpit/instruments/gradient_tests_gauge.py", line 91, in _format
        ax_norm.set_yscale("log")
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/axes/_base.py", line 3708, in set_yscale
        ax.yaxis._set_scale(value, **kwargs)
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/axis.py", line 800, in _set_scale
        self._scale.set_default_locators_and_formatters(self)
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/scale.py", line 406, in set_default_locators_and_formatters
        axis.set_major_locator(LogLocator(self.base))
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/axis.py", line 1651, in set_major_locator
        self.stale = True
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/artist.py", line 230, in stale
        self.stale_callback(self, val)
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/artist.py", line 51, in _stale_axes_callback
        self.axes.stale = val
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/artist.py", line 230, in stale
        self.stale_callback(self, val)
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/figure.py", line 51, in _stale_figure_callback
        self.figure.stale = val
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/artist.py", line 230, in stale
        self.stale_callback(self, val)
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/pyplot.py", line 589, in _auto_draw_if_interactive
        fig.canvas.draw_idle()
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/backend_bases.py", line 1907, in draw_idle
        self.draw(*args, **kwargs)
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/backends/backend_agg.py", line 388, in draw
        self.figure.draw(self.renderer)
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/artist.py", line 38, in draw_wrapper
        return draw(artist, renderer, *args, **kwargs)
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/figure.py", line 1709, in draw
        renderer, self, artists, self.suppressComposite)
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/image.py", line 135, in _draw_list_compositing_images
        a.draw(renderer)
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/artist.py", line 38, in draw_wrapper
        return draw(artist, renderer, *args, **kwargs)
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/axes/_base.py", line 2647, in draw
        mimage._draw_list_compositing_images(renderer, self, artists)
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/image.py", line 135, in _draw_list_compositing_images
        a.draw(renderer)
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/artist.py", line 38, in draw_wrapper
        return draw(artist, renderer, *args, **kwargs)
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/axis.py", line 1203, in draw
        ticks_to_draw = self._update_ticks()
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/axis.py", line 1079, in _update_ticks
        major_locs = self.get_majorticklocs()
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/axis.py", line 1324, in get_majorticklocs
        return self.major.locator()
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/ticker.py", line 2271, in __call__
        return self.tick_values(vmin, vmax)
      File "./envs/pytorch/lib/python3.7/site-packages/matplotlib/ticker.py", line 2297, in tick_values
        "Data has no positive values, and therefore can not be "
    ValueError: Data has no positive values, and therefore can not be log-scaled.
    
    🐛 Type: Bug 🆕 Status: New 
    opened by mseitzer 5
  • [BUG] Extension hook executes on custom module

    Cockpit crashes with a non-descriptive error.

    Description

    I am trying to run cockpit on a simple network trained with MSE loss (although implemented through custom modules, not nn.Sequential). As far as I can see, no unsupported operations are involved (Linear, Sequential, Tanh, Identity).

    Cockpit crashes on computing the BatchGrad extension. Using backpack to compute the batch gradient works without crashes.

    Steps to Reproduce

    1. Setup
    self.loss_fn = lambda pred, target: ((target - pred) ** 2).mean(dim=-1)
    
    self._cockpit = Cockpit(model.parameters(), quantities=configuration("economy"))
    model = MyCustomModel()
    model = extend(model)
    optimizer = torch.optim.Adam(params=model.parameters(), lr=1e-3)
    
    2. Training Step
    pred = model(inp)
    loss = self.loss_fn(pred, target)
    mean_loss = loss.mean()
    
    optimizer.zero_grad()
    info = {
        "batch_size": len(loss),
        "individual_losses": loss,
        "loss": mean_loss,
        "optimizer": optimizer,
    }
    with self._cockpit(step, info=info, debug=True):
        create_graph = self._cockpit.create_graph(step)
        mean_loss.backward(create_graph=create_graph)  # CRASH HERE
    optimizer.step()
    

    Source or Possible Fix


    Stacktrace

    [DEBUG, step 0]
     ↪Quantities  : [<cockpit.quantities.alpha.Alpha object at 0x7fa751af5f10>, <cockpit.quantities.distance.Distance object at 0x7fa74e7c0cd0>, <cockpit.quantities.grad_hist.GradHist1d object at 0x7fa74e7c0d10>, <cockpit.quantities.grad_norm.GradNorm object at 0x7fa74e7c0d90>, <cockpit.quantities.inner_test.InnerTest object at 0x7fa74e7c0dd0>, <cockpit.quantities.loss.Loss object at 0x7fa74e7cd050>, <cockpit.quantities.norm_test.NormTest object at 0x7fa74e7cd090>, <cockpit.quantities.ortho_test.OrthoTest object at 0x7fa74e7cd0d0>, <cockpit.quantities.update_size.UpdateSize object at 0x7fa74e7cd110>]
     ↪Extensions  : [<backpack.extensions.firstorder.batch_grad.BatchGrad object at 0x7fa74e7cd750>]
     ↪Hooks       : <cockpit.quantities.utils_transforms.BatchGradTransformsHook object at 0x7fa74e7931d0>
     ↪Create graph: False
     ↪Save memory : True
    [DEBUG] Running extension <backpack.extensions.firstorder.batch_grad.BatchGrad object at 0x7fa74e7cd750> on Linear(in_features=128, out_features=1, bias=True)
    [DEBUG] Running extension <backpack.extensions.firstorder.batch_grad.BatchGrad object at 0x7fa74e7cd750> on LinearHead(
      (l): Linear(in_features=128, out_features=1, bias=True)
    )
    [DEBUG] Running extension <backpack.extensions.firstorder.batch_grad.BatchGrad object at 0x7fa74e7cd750> on Identity()
    [DEBUG] Running extension <backpack.extensions.firstorder.batch_grad.BatchGrad object at 0x7fa74e7cd750> on Sequential(
      (0): Linear(in_features=1, out_features=128, bias=True)
      (1): Tanh()
      (2): Linear(in_features=128, out_features=128, bias=True)
      (3): Tanh()
      (4): LinearHead(
        (l): Linear(in_features=128, out_features=1, bias=True)
      )
      (5): Identity()
    )
    [DEBUG] Running extension <backpack.extensions.firstorder.batch_grad.BatchGrad object at 0x7fa74e7cd750> on MLP(
      (layers): Sequential(
        (0): Linear(in_features=1, out_features=128, bias=True)
        (1): Tanh()
        (2): Linear(in_features=128, out_features=128, bias=True)
        (3): Tanh()
        (4): LinearHead(
          (l): Linear(in_features=128, out_features=1, bias=True)
        )
        (5): Identity()
      )
    )
    Traceback (most recent call last):
      File "./envs/pytorch/lib/python3.7/site-packages/backpack/__init__.py", line 169, in run_extension_hook
        CTX.get_extension_hook()(module)
      File "./envs/pytorch/lib/python3.7/site-packages/cockpit/quantities/hooks/base.py", line 53, in __call__
        self.run_hook(param, module)
      File "./envs/pytorch/lib/python3.7/site-packages/cockpit/quantities/hooks/base.py", line 80, in run_hook
        value = self.module_hook(param, module)
      File "./envs/pytorch/lib/python3.7/site-packages/cockpit/quantities/hooks/base.py", line 139, in module_hook
        return self.param_hook(param)
      File "./envs/pytorch/lib/python3.7/site-packages/cockpit/quantities/utils_transforms.py", line 78, in param_hook
        param.grad_batch._param_weakref = weakref.ref(param)
    AttributeError: 'Parameter' object has no attribute 'grad_batch'
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "./code/trainers/base_trainer.py", line 112, in backward
        mean_loss.backward(create_graph=create_graph)
      File "./envs/pytorch/lib/python3.7/site-packages/torch/tensor.py", line 221, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph)
      File "./envs/pytorch/lib/python3.7/site-packages/torch/autograd/__init__.py", line 132, in backward
        allow_unreachable=True)  # allow_unreachable flag
      File "./envs/pytorch/lib/python3.7/site-packages/backpack/__init__.py", line 151, in hook_run_extensions
        run_extension_hook(module)
      File "./envs/pytorch/lib/python3.7/site-packages/backpack/__init__.py", line 172, in run_extension_hook
        raise RuntimeError(f"Post extensions hook failed: {message}")
    RuntimeError: Post extensions hook failed: AttributeError("'Parameter' object has no attribute 'grad_batch'")
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "code/train.py", line 215, in <module>
        metrics = main(opts)
      File "code/train.py", line 185, in main
        training_info = trainer.train(model, dataset, eval_data, logger)
      File "./code/trainers/mse_trainer.py", line 128, in train
        self.backward(global_step, loss, mean_loss, optimizer)
      File "./code/trainers/base_trainer.py", line 112, in backward
        mean_loss.backward(create_graph=create_graph)
      File "./envs/pytorch/lib/python3.7/site-packages/cockpit/context.py", line 137, in __exit__
        self.cp.track(self.global_step, protected_savefields=self.protected_savefields)
      File "./envs/pytorch/lib/python3.7/site-packages/cockpit/cockpit.py", line 178, in track
        q.track(global_step, self.params, batch_loss)
      File "./envs/pytorch/lib/python3.7/site-packages/cockpit/quantities/quantity.py", line 87, in track
        iteration, result = self.compute(global_step, params, batch_loss)
      File "./envs/pytorch/lib/python3.7/site-packages/cockpit/quantities/quantity.py", line 516, in compute
        save_result = self._compute(global_step, params, batch_loss)
      File "./envs/pytorch/lib/python3.7/site-packages/cockpit/quantities/quantity.py", line 538, in _compute
        self._compute_start(global_step, params, batch_loss)
      File "./envs/pytorch/lib/python3.7/site-packages/cockpit/quantities/alpha.py", line 280, in _compute_start
        self._save_1st_order_info(global_step, params, batch_loss, point, until)
      File "./envs/pytorch/lib/python3.7/site-packages/cockpit/quantities/alpha.py", line 326, in _save_1st_order_info
        grad_dict = {id(p): p.grad.data.clone().detach() for p in params}
      File "./envs/pytorch/lib/python3.7/site-packages/cockpit/quantities/alpha.py", line 326, in <dictcomp>
        grad_dict = {id(p): p.grad.data.clone().detach() for p in params}
    AttributeError: 'NoneType' object has no attribute 'data'
    
    
    🐛 Type: Bug 👷 Status: In Progress 
    opened by mseitzer 5
  • GSNR computation incorrect?

    I have a question regarding the scaling of the GSNR computation as provided in Cockpit

    Description

    In the following gist, I've written 3 different ways of computing the GSNR as defined in this paper, the same one cited in the Cockpit paper:

    1. Extracting the variance directly from the sample-wise gradients, via BackPACK BatchGrad() extension (bad, memory inefficient)
    2. Using the Variance() BackPACK extension (efficient & straightforward)
    3. Using a replacement formula for the variance to avoid sample-wise grad squaring: Var(X) = E[X²] - E[X]² (likely faster, less straightforward)

    https://gist.github.com/andres-fr/db9d0ba31d1502df62a09d382e504d1e

    Specifically, the first and second reflect the mathematical definition from the paper:

    GSNR_paper = grad_squared / Var(gradients)
    

    And this is the implementation of the replacement formula that I found to be equivalent:

    GSNR_gist = grad_squared / ( (sum_grad_squared / batch_size) - (grad_squared / (batch_size**2)) + epsilon)
    

    My question: when I compare the last expression to the one in the Cockpit source code, there is a clear difference in the batch scaling:

    GSNR_cockpit = grad_squared / (batch_size * sum_grad_squared - grad_squared + epsilon)
    

    When I used this last expression in my gist, it led to results clearly different from the other approaches. Am I missing something here? Is cockpit doing some further scaling under the hood? Or are these results scaled differently from the definition for some reason?
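The replacement formula from item 3 can be checked numerically. What follows is a minimal sketch, assuming a single scalar parameter, the biased (population) variance, and per-sample gradients that are not pre-scaled by the batch size. The function names and the `eps` default are hypothetical. This is neither the gist's nor Cockpit's implementation, and it does not settle which batch scaling is intended.

```python
import random


def gsnr_direct(per_sample_grads, eps=1e-12):
    """GSNR = mean(g)^2 / Var(g), with the variance taken from the definition."""
    n = len(per_sample_grads)
    mean = sum(per_sample_grads) / n
    var = sum((g - mean) ** 2 for g in per_sample_grads) / n
    return mean ** 2 / (var + eps)


def gsnr_moments(per_sample_grads, eps=1e-12):
    """Same quantity, using Var(X) = E[X^2] - E[X]^2."""
    n = len(per_sample_grads)
    mean = sum(per_sample_grads) / n
    second_moment = sum(g * g for g in per_sample_grads) / n
    var = second_moment - mean ** 2
    return mean ** 2 / (var + eps)


# Synthetic per-sample gradients for one parameter (assumption: batch size 64).
random.seed(0)
grads = [random.gauss(0.5, 1.0) for _ in range(64)]

direct = gsnr_direct(grads)
moments = gsnr_moments(grads)
print(direct, moments)  # the two values agree up to floating-point error
```

If the per-sample gradients are already multiplied by 1/N, as the BackPACK scaling note mentioned in this issue suggests can happen, both the numerator and the variance pick up extra factors of the batch size, which is one possible source of the discrepancy discussed here.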


    Steps to Reproduce

    The gist works on CPU and CUDA. For easier reproduction, the gist depends on PyTorch and BackPACK only. The script docstring includes steps to install a minimal working conda environment. Then, simply run the script.

    Source or Possible Fix

    The BackPACK docs already warned about scaling issues with the Variance, which may even involve the loss function, so I wonder if this is a bug/feature.

    Intuitively, I'd say it is good if the quantity matches the definition from the paper, but again, implicit scaling may be getting in the way?

    🐛 Type: Bug 🆕 Status: New 
    opened by andres-fr 4
  • Plots from remote machine

    I am running some experiments on a remote machine, and I can't see any cockpit plot. I have the same problem when running the basic example. Is there a way of solving this? Am I missing something?

    This may be a dumb question, sorry I am quite new to this.

    Thank you!

    ❔ Type: Question 🐛 Type: Bug 👷 Status: In Progress 
    opened by Niccolo-Ajroldi 4
  • Monai 3D Resnet models

    Hi!

    Thank you for this project, indeed it is quite interesting work.

    I am curious whether it can be used to interpret/observe the training of a 3D ResNet model.

    Regards

    ❔ Type: Question 
    opened by Mushtaqml 4
  • No explicit loss and individual loss

    Description

    What should I do if my model returns the tuple of loss and individual losses and I have no explicit loss function class?

    At the moment if I just extend the model, I get:

    cockpit/quantities/alpha.py in <dictcomp>(.0)
        324         self.save_to_cache(global_step, f"params_{point}", params_dict, block_fn)
        325 
    --> 326         grad_dict = {id(p): p.grad.data.clone().detach() for p in params}
        327         self.save_to_cache(global_step, f"grad_{point}", grad_dict, block_fn)
        328 
    
    AttributeError: 'NoneType' object has no attribute 'data'
    

    Thanks for any suggestions!

    🐛 Type: Bug 
    opened by kashif 4
  • Allow specifying extra parameters to `fig.savefig` in `CockpitPlotter.plot`

    Description

    I suggest extending CockpitPlotter.plot so that a keyword dictionary specifying parameters for fig.savefig can be passed to it. This would make it possible to choose 1) a DPI value or 2) a different file format (e.g. PDF).

    Currently, the stored image files have a poor resolution and thus are not very readable. Being able to choose a DPI value or saving to PDF would fix this issue.

    🆕 Status: New 
    opened by mseitzer 4
  • Plotter accepts savedir parameter but turns it into file path

    Description

    Plotter has a savedir parameter for which the documentation says it specifies the directory where to save the plot to. However, the _save function interprets this parameter as a file path: https://github.com/f-dangel/cockpit/blob/937b3eac8d9fef7e6e6fbc9c91863dd08b1362b1/cockpit/plotter.py#L400-L405

    This means plotting with savedir='/example/path' would result in a file /example/path__primary.png, where /example/path/__primary.png would be expected.

    Besides, when choosing to not display the plot, I don't think __primary or __secondary should be added to the filename. From an API standpoint, it would probably be best to let the user flexibly choose the path and filename to save to (potentially adding the image extension).
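The difference between the reported and the expected path can be sketched as follows. This is an illustration only: the two helper functions are hypothetical and are not Cockpit's `_save` implementation. They reproduce the two path shapes described in this report, using `posixpath` so the output is the same on every platform.

```python
import posixpath


def path_by_concatenation(savedir, suffix="__primary", ext=".png"):
    # Reported behavior: the suffix is appended to the savedir string itself,
    # so savedir is effectively treated as a file path prefix.
    return savedir + suffix + ext


def path_inside_directory(savedir, suffix="__primary", ext=".png"):
    # Expected behavior: savedir is a directory and the file is placed inside it.
    return posixpath.join(savedir, suffix + ext)


reported = path_by_concatenation("/example/path")
expected = path_inside_directory("/example/path")
print(reported)  # /example/path__primary.png
print(expected)  # /example/path/__primary.png
```

Joining the directory and file name explicitly also makes it straightforward to let the caller pick the file name, as suggested above.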

    🐛 Type: Bug 🆕 Status: New 
    opened by mseitzer 3
  • Learning rates and HyperParameters

    Hi!

    I am trying to replicate the plots you show, but I cannot find the learning rates and hyperparameters in the JSON file that is created. Where can I find them?

    Also, I was wondering whether the quantities are described somewhere in the documentation. I found a table with names and descriptions, but the dimensions, for example, are not explained, and for some quantities I cannot understand why they have such dimensions in the JSON file.

    Thanks a lot!

    Bests, Cristian

    ❔ Type: Question 
    opened by Cmeo97 2
  • Gradient analysis from named parameters without backpack

    Gradient analysis from named parameters without backpack

    I have Cockpit mostly working, except that I get warnings like this because I have not implemented BackPACK's batch grad for all my modules:

    /usr/local/lib/python3.9/site-packages/backpack/extensions/backprop_extension.py:106: UserWarning: Extension saving to grad_batch does not have an extension for Module <class 'pycompress.model_parts.PredHist'> although the module has parameters 
    

    As a result, I am not able to get all the quantities without error. I understand BackPACK is central to the design of Cockpit, but I am fine with some things not working when I just want to prototype quickly.

    My questions/requests are:

    1. I still see the gradient histogram. Is that histogram only including the gradients for modules where the backpack extension worked or is it all of them?
    2. Have you thought about integrating the much smaller project https://github.com/alwynmathew/gradflow-check? It provides a chart that lets you check for vanishing gradients, and it does so without BackPACK, based only on the named_parameters.
    3. Unrelated, but by the way: have you thought about including the ESD of the weight matrices, as done in https://github.com/CalculatedContent/WeightWatcher?

    Thank you for your excellent work on this project!

    ❔ Type: Question 
    opened by jogardi 2
  • Support for Transformers

    Requesting a new feature (I can't seem to add a label myself).

    Description

    Do you currently support Huggingface Transformers? In particular, I'd like to debug T5.

    I tried an editable install of Transformers, and modifying the PyTorch implementation underneath, but I can't quite get the two different loss functions to play together properly. For example, I'm trying to steal this implementation from the Basic Examples, and add it to Transformers:

    loss_fn = extend(torch.nn.CrossEntropyLoss(reduction="mean"))
    individual_loss_fn = torch.nn.CrossEntropyLoss(reduction="none")
    

    After 30 minutes of tinkering, I can't get around this error. Thanks for any assistance.

    AssertionError: BackPACK extension expects a backpropagation quantity but it is None. Module: Linear(in_features=768, out_features=32128, bias=False), Extension: <backpack.extensions.secondorder.diag_hessian.DiagHessian object at 0x7f6a72f102b0>.
    
    During handling of the above exception, another exception occurred:
    
    AttributeError                            Traceback (most recent call last)
    Cell In[6], line 80
         69 # loss.sum().backward()
         70 # backward pass
         71 with cockpit(
         72     global_step,
         73     info={
       (...)
         78     },
         79 ):
    ---> 80     loss.sum().backward(create_graph=cockpit.create_graph(global_step))
         82 # optimizer step
         83 optimizer.step()
    
    File ~/githubs/cockpit/cockpit/context.py:155, in BackwardCTX.__exit__(self, type, value, traceback)
        152 for ctx in self.contexts:
        153     ctx.__exit__(type, value, traceback)
    --> 155 self.cp.track(self.global_step, protected_savefields=self.protected_savefields)
        157 CockpitCTX.erase()
    
    File ~/githubs/cockpit/cockpit/cockpit.py:195, in Cockpit.track(self, global_step, protected_savefields)
        190 before_cleanup = [
        191     q for q in self.quantities if not isinstance(q, quantities.HessMaxEV)
        192 ]
        194 for q in before_cleanup:
    --> 195     q.track(global_step, self.params, batch_loss)
        197 self._free_backpack_buffers(global_step, protected_savefields)
        199 after_cleanup = [
        200     q for q in self.quantities if isinstance(q, quantities.HessMaxEV)
        201 ]
    
    File ~/githubs/cockpit/cockpit/quantities/quantity.py:101, in Quantity.track(self, global_step, params, batch_loss)
         92 """Perform scheduled computations and store result.
         93 
         94 Args:
       (...)
    ...
    --> 335 grad_dict = {id(p): p.grad.data.clone().detach() for p in params}
        336 self.save_to_cache(global_step, f"grad_{point}", grad_dict, block_fn)
        338 # L = ¹/ₙ ∑ᵢ ℓᵢ, BackPACK's BatchGrad computes ¹/ₙ ∇ℓᵢ, we have to rescale
    
    AttributeError: 'NoneType' object has no attribute 'data'
    

    Update: after 2 more hours of tinkering, I'm pretty sure I am exactly matching your examples, but I'm still facing the exact same issue.

    BTW, I think I found this project at NeurIPS. Great work on it!

    🆕 Status: New 
    opened by KastanDay 1
Releases(v1.0.2)
Owner
Felix Dangel
Machine Learning PhD student at the University of Tübingen and the Max Planck Institute for Intelligent Systems.