Comments (2)
Hi Ian,
Wonderful to hear you enjoyed our work! Thanks for these comments; I've put together some thoughts below. I'll tag an owner of this shared research repo to close this issue, but feel free to move this to email if you have additional questions (the author email address for correspondence is listed in our paper).
- Lottery ticket experiments using one-shot sparsification instead of iterative pruning
I agree, it would be fun to evaluate whether the lottery ticket results hold on these large-scale tasks with "one-shot" sparsification. In fact, one of the questions examined in The Lottery Ticket Hypothesis is whether lottery tickets occur in both one-shot pruned and iteratively pruned networks.
However, for both one-shot and iteratively pruned networks, the authors compare 1) the performance of the sparse substructure trained from scratch (with the same weights as the initial random initialization) to 2) the performance of the original network.
The variant you propose appears to be quite different, because you are comparing the performance of the sparse substructure trained from scratch (with the same weights as the initial random initialization) to the one-shot pruned structure at the end of training.
Since both variants would likely perform substantially worse than the original model, it is unclear what information we gain here. I.e., you won't be able to tell whether the ability to match accuracy when re-training is a product of your hypothesis or just a consequence of the accuracy to match being worse (we suspect it is the latter). It's an interesting question, but I don't see a clear way to disentangle the answer. Still, it's easy to run this variant, and perhaps the results will surprise. :) You can simply run magnitude pruning for a desired fraction of sparsity once at the end of training (I believe by setting begin_pruning_step and end_pruning_step both to one step before the last step of training).
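For readers unfamiliar with the mechanics, one-shot magnitude pruning itself is simple: after training, zero out the smallest-magnitude fraction of the weights in a single pass. A minimal NumPy sketch (the function name and tie-breaking choice are my own, not from the paper or the library):

```python
import numpy as np

def one_shot_magnitude_prune(weights, sparsity):
    """Zero out (at least) the smallest-magnitude `sparsity` fraction
    of entries in `weights`, in a single shot.

    Ties at the threshold are all pruned, so the realized sparsity can
    slightly exceed the requested fraction.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to remove
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask
```

For example, pruning `[[0.1, -0.5], [2.0, -0.05]]` at 50% sparsity zeros the two smallest-magnitude entries (0.1 and -0.05) and leaves -0.5 and 2.0 untouched.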
- Knowledge reconstitution
Hmmm, this I know less about. I believe Erich Elsen, one of my co-authors, worked on a project related to this idea called dense-sparse-dense.
Hope these answers are somewhat helpful. Thanks again, Ian, for taking the time to put together these thoughts.
from google-research.
I think Sara meant to say that you should set begin_pruning_step = final_step - 1 and end_pruning_step = final_step to mimic zero-shot pruning. You'll also need to set the threshold_decay parameter to 0; otherwise the threshold won't immediately jump to the value needed to reach the sparsity level you want.
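Put together, the pruning hyperparameters described above might be assembled like this (a sketch only: `final_step` and `target_sparsity` are placeholder values, and you should check the exact parameter names against the model_pruning library version you have installed):

```python
# Hypothetical values -- substitute your own schedule.
final_step = 100000      # total number of training steps
target_sparsity = 0.9    # desired fraction of zeroed weights

# Prune in a single shot just before training ends, with
# threshold_decay=0 so the magnitude threshold jumps immediately
# to whatever value yields the target sparsity.
pruning_hparams = ",".join([
    "begin_pruning_step=%d" % (final_step - 1),
    "end_pruning_step=%d" % final_step,
    "threshold_decay=0.0",
    "target_sparsity=%.2f" % target_sparsity,
])
```

The resulting comma-separated string is in the form the library's hparam parser expects.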
Based on previous experience I've had with zero-shot pruning (see, for example, the last line of Table 4 in https://arxiv.org/pdf/1704.05119.pdf, where the error rate more than doubles at 90% pruning), I would guess that zero-shot pruning will actually lead to worse accuracies than random fixed sparsity patterns trained from scratch. If you try this, I would love to know the results.