condition to regrow the connection!! #21

Open
vkvats opened this issue Oct 9, 2020 · 1 comment
Labels
question Further information is requested

Comments


vkvats commented Oct 9, 2020

This is a great paper, full of information and ideas. Thank you for this amazing work.

While reading, I came across this line: "we want to look at the momentum magnitude of “missing” or zero-valued weights, that is, we want to look at those weights which have been excluded from training before." Suppose a missing weight has a large momentum, and that this momentum has accumulated over several updates rather than from a single one. Why was that connection missing in the first place? Is it because the regrowth step is not run often enough? And if so, could regrowing connections more frequently give faster convergence?

Thank you for your time and attention.
Vibhas.

@TimDettmers TimDettmers added the question Further information is requested label Oct 14, 2020
TimDettmers (Owner) commented

The "missing" weights are the weights that are 0. The gradient can still be calculated for those weights, and the momentum is just the exponential moving average of these gradients over time. As such, it is possible for the momentum buffers of these missing weights to be large. Regrowing connections more frequently might yield faster convergence. I have tried it a little and did not see any obvious improvement, but I also did not test it carefully. I think it is an open research question how the frequency of changing the sparsity pattern relates to time to convergence.
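
To make the point about momentum on zero-valued weights concrete, here is a minimal pure-Python sketch. It is not code from this repository: the helper names `update_momentum` and `regrow_candidates`, the smoothing factor `beta=0.9`, and the gradient values are all assumptions made for illustration. It assumes the momentum is an exponential moving average of the dense gradient, and that regrowth picks the zero-valued weights with the largest momentum magnitude. The weights are held fixed so that only the momentum buffers change.

```python
# Hypothetical sketch; names and numbers are illustrative, not the library's API.

def update_momentum(momentum, grads, beta=0.9):
    """Exponential moving average of gradients: m <- beta * m + (1 - beta) * g."""
    return [beta * m + (1.0 - beta) * g for m, g in zip(momentum, grads)]


def regrow_candidates(weights, momentum, k):
    """Indices of the k zero-valued ("missing") weights with the largest |momentum|."""
    missing = [i for i, w in enumerate(weights) if w == 0.0]
    missing.sort(key=lambda i: abs(momentum[i]), reverse=True)
    return missing[:k]


weights = [0.5, 0.0, 0.0, -0.3]    # indices 1 and 2 are "missing"
momentum = [0.0, 0.0, 0.0, 0.0]

# Made-up dense gradients: index 1 receives a consistent signal,
# index 2 receives a gradient of the same size whose sign flips every step.
for step in range(20):
    sign = 1.0 if step % 2 == 0 else -1.0
    grads = [0.1, 1.0, sign * 1.0, -0.2]
    momentum = update_momentum(momentum, grads)

picked = regrow_candidates(weights, momentum, k=1)
print(picked, momentum[1], momentum[2])
```

In this toy setup the missing weight with a consistent gradient builds up a much larger momentum than the one whose gradient keeps flipping sign, so it is the one selected for regrowth. How often such a regrowth step should run is exactly the open question raised above.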
