
Releases: TimDettmers/sparse_learning

Release v1.0

13 Sep 01:59

Release v1.0: Bug fixes, New ImageNet Baselines

Bug fixes:

  • Changed to boolean indexing for PyTorch 1.2 compatibility.
  • Fixed an error that could occur in global pruning algorithms when very few weights were removed for a layer.
  • Removed momentum reset. This feature did not have any effect on performance and made the algorithm more complex.
  • Fixed an error where two layers of VGG16 were removed by the remove_weight_partial_name() function. Results were slightly degraded, but the weights needed for dense performance and the relative ordering compared to other methods remained the same.

Features:

  • The evaluation script can now aggregate log files organized in a folder hierarchy; results are aggregated per folder.
  • Added a decay schedule argument that lets you choose between linear and cosine prune-rate decay schedules (see the sketch after this list).
  • Added a new ImageNet baseline based on the codebase of Mostafa & Wang, 2019.
  • Added a max-thread argument that sets the maximum total number of data loader threads across the training, validation, and test set loaders.
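
For illustration, here is a minimal sketch of the two prune-rate decay shapes. The helper function below is hypothetical and not part of the library; it only shows how the cosine and linear schedules behave.

    import math

    def prune_rate_at_epoch(initial_rate, epoch, total_epochs, schedule="cosine"):
        """Illustrative prune-rate decay (hypothetical helper, not the library's implementation)."""
        t = epoch / total_epochs  # fraction of training completed, in [0, 1]
        if schedule == "cosine":
            # Cosine decay: starts at initial_rate and anneals smoothly to 0.
            return 0.5 * initial_rate * (1.0 + math.cos(math.pi * t))
        # Linear decay: straight line from initial_rate down to 0.
        return initial_rate * (1.0 - t)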

Release v0.2

08 Aug 00:44

Release v0.2: FP16 support, modularity of prune/growth/redistribution algorithms.

Bug fixes:

  • Fixed a bug where magnitude pruning pruned too many parameters when the weight was dense (>95% density) and the pruning rate was small (<5%).
    First experiments on LeNet-5 Caffe indicate that this change did not affect performance for networks that learn to have dense weights.
    I will replicate this across architectures to make sure this bugfix does not change performance.
  • Fixed instabilities in SET (sparse evolutionary training) pruning which could cause NaN values in specific circumstances.
  • Fixed a bug where global pruning would throw an error if a layer was fully dense and had a low prune rate.

Documentation:

  • Added basic docstring documentation.

Features:

  • MNIST/CIFAR: Separate log files are now created for different models/densities/names.
  • MNIST/CIFAR: Aggregate mean test accuracy with standard errors can now be automatically extracted from logs with python get_results_from_logs.py.
  • Added FP16 support. Any model can now be run in 16-bit by passing the apex FP16_Optimizer into the Masking class and replacing loss.backward() with optimizer.backward(loss) (a minimal sketch follows this list).
  • Added adapted Dynamic Sparse Reparameterization codebase that works with sparse momentum.
  • Added a modular architecture for growth/prune/redistribution algorithms which is decoupled from the main library. This enables you to write your own prune/growth/redistribution algorithms without touching the library internals (a sketch of a custom prune function also follows this list). A tutorial on how to add your own functions was also added: How to Add Your Own Algorithms.
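
A minimal sketch of the FP16 setup described above, assuming apex's FP16_Optimizer and the Masking class from sparselearning.core. The Masking keyword arguments and the mask.step() call are assumptions based on the repository's examples; check the README for the exact signature.

    import torch
    import torch.nn.functional as F
    from apex.fp16_utils import FP16_Optimizer   # NVIDIA apex (legacy FP16 utilities)
    from sparselearning.core import Masking      # import path assumed from this repository

    # A tiny stand-in network; substitute your own model.
    model = torch.nn.Sequential(torch.nn.Linear(784, 300), torch.nn.ReLU(),
                                torch.nn.Linear(300, 10)).cuda().half()

    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    optimizer = FP16_Optimizer(optimizer, dynamic_loss_scale=True)

    # Keyword arguments below are illustrative; see the README for the real constructor.
    mask = Masking(optimizer, prune_rate=0.5)
    mask.add_module(model, density=0.1)

    # One synthetic training step.
    data = torch.randn(64, 784).cuda().half()
    target = torch.randint(0, 10, (64,)).cuda()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(data), target)
    optimizer.backward(loss)   # replaces loss.backward(), as noted above
    mask.step()                # assumed call: steps the optimizer and reapplies the masks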
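
As a sketch of the kind of function the modular interface accepts, here is a magnitude-based prune function. The (masking, mask, weight, name) signature, the masking.prune_rate attribute, and how the function is registered are assumptions; consult the How to Add Your Own Algorithms tutorial for the exact interface.

    import torch

    def smallest_magnitude_prune(masking, mask, weight, name):
        """Illustrative custom prune function: drops the smallest-magnitude
        active weights in this layer and returns the updated mask."""
        num_remove = int(masking.prune_rate * mask.sum().item())  # prune_rate attribute assumed
        if num_remove == 0:
            return mask
        # Score only currently active weights; inactive ones get +inf so they are never selected.
        scores = torch.where(mask.bool(), weight.abs(), torch.full_like(weight, float("inf")))
        _, idx = torch.sort(scores.flatten())
        new_mask = mask.flatten().clone()
        new_mask[idx[:num_remove]] = 0.0
        return new_mask.view_as(mask)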

API:

  • Changed names from "death" to "prune" to be more consistent with the terminology in the paper.
  • Added --verbose argument to print the parameter distribution before/after pruning at the end of each epoch. By default, the pruning distribution will no longer be printed.
  • Removed the --sparse flag and added a --dense flag. The default is args.dense==False, so sparse mode is enabled by default. To run a dense model, pass the --dense argument.