Comments (8)
Implementation of GSA in the code is from:
- https://github.com/lucidrains/global-self-attention-network
- https://openreview.net/forum?id=KiFeuZu24k (rejected by ICLR 2021 by the way)
Based on lucidrains' repository, one could refer to this for prior work:
Efficient attention is an attention mechanism that substantially optimizes the memory and computational efficiency while retaining exactly the same expressive power as the conventional dot-product attention.
Apparently, it is a cheaper way to compute attention.
It brings an attention mechanism into the model, but does not increase its size a lot because it does not add new features, etc.
from lightweight-gan.
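The quoted description can be made concrete with a minimal sketch of the efficient-attention trick (this is not the repository's actual code; the single-head setup and shapes are assumptions for illustration): softmax is applied separately to queries and keys, which lets the key–value contraction happen first and avoids ever forming the n×n attention matrix.

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def efficient_attention(q, k, v):
    # q, k: (n, d_k); v: (n, d_v). Single head for clarity.
    # Normalizing q over features and k over positions lets us
    # contract k with v first: cost O(n*d_k*d_v) instead of O(n^2*d).
    q = softmax(q, axis=-1)   # softmax over the feature dimension
    k = softmax(k, axis=0)    # softmax over the sequence dimension
    context = k.T @ v         # (d_k, d_v) global context summary
    return q @ context        # (n, d_v), same shape as standard attention

out = efficient_attention(np.random.randn(16, 8),
                          np.random.randn(16, 8),
                          np.random.randn(16, 8))
print(out.shape)  # (16, 8)
```

Because the n×n score matrix is never materialized, memory grows linearly with the number of spatial positions, which is what makes it practical at higher feature-map resolutions.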
Yes, using attention does improve quality, at least as reflected in the FID scores, which tend to go lower.
The tradeoff is higher memory usage and longer training time.
from lightweight-gan.
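To put a rough number on the memory tradeoff (illustrative arithmetic, not figures measured from the repository): standard dot-product attention materializes an n×n score matrix over the flattened feature map.

```python
# Illustrative memory cost of a standard n x n attention matrix
# on a 64x64 feature map (the resolution is an assumption, not measured):
h = w = 64
n = h * w                      # 4096 spatial positions (tokens)
attn_entries = n * n           # 16,777,216 score entries per head
bytes_fp32 = attn_entries * 4  # exactly 64 MiB per head, per layer, per sample
print(attn_entries, bytes_fp32 // 2**20)  # 16777216 64
```

Doubling the feature-map side length multiplies that matrix by 16, which is why attention at high resolutions is where the memory and training-time costs show up.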
What is better: changing [32] to [96], or to [32,64]? What is the difference?
from lightweight-gan.
What is better: changing [32] to [96], or to [32,64]? What is the difference?
I think it should be a power of 2, so 96 would not be valid.
lightweight-gan/lightweight_gan/lightweight_gan.py, lines 396 to 401 at commit 845eb9d
from lightweight-gan.
Of course, but my question is about the difference between one large value vs. two smaller values.
from lightweight-gan.
@Dok11 I think you are misunderstanding the value: it puts multiple attention layers into the neural network graph at the resolutions you specify, so more resolutions are better, of course, as you get attention at different levels. It's the same as with convolutions. If you can only afford one, it depends on your training data: if it has a lot of global structure, a lower-resolution layer is beneficial; if it has a lot of local structure, a higher-resolution layer is more beneficial.
from lightweight-gan.
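The explanation above can be sketched as a toy block plan (block names, resolutions, and the builder function are all hypothetical, not the repository's actual code): each entry in `attn_res_layers` marks a feature-map resolution where an attention layer is inserted alongside the convolutional block at that scale.

```python
def build_generator_plan(attn_res_layers, max_res=256):
    """Toy illustration: list the blocks a generator might stack.

    `attn_res_layers` names the feature-map resolutions that get an
    attention layer; everything else is hypothetical scaffolding.
    """
    plan, res = [], 4
    while res <= max_res:
        plan.append(f'conv_block_{res}x{res}')
        if res in attn_res_layers:
            plan.append(f'attention_{res}x{res}')
        res *= 2
    return plan

print(build_generator_plan([32]))      # attention at one scale: 32x32
print(build_generator_plan([32, 64]))  # attention at two scales: 32x32 and 64x64
```

Under this reading, [32,64] gives the model attention at two different levels of structure, whereas a single value only covers one, which is the point made about global vs. local structure above.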
I thought the same, but I hoped someone could help me with some examples of these values.
For example, for some purposes we would use the parameter [32], while for other purposes/images we would use [8,16,32,64].
Maybe there is a reason to create a synthetic dataset to test these parameters in practice? Like this:
from lightweight-gan.
@Mut1nyJD I still do not understand attention layers, but I think I have a reasonable question. Changing attn-res-layers from [32] to [32,64,128,256] increases the model file size by no more than two megabytes. So does it really improve quality?
Yes, model training requires more memory and time. So I am confused: training is slower, but the model stays (almost) the same size. I think that means the model does not increase its own capacity. How will the model make more detailed images with the same size?
If you know some sources with a simple description of this technique, please let me know.
from lightweight-gan.
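One way to see why the checkpoint barely grows (a back-of-the-envelope sketch with an assumed channel width, not the repository's real numbers): an attention layer's learnable weights are just the q/k/v/output projections, whose size depends on the channel count, not on the feature-map resolution. The extra training cost comes from activations, which do scale with resolution.

```python
# Weights added by one single-head attention layer with 1x1-conv
# projections (the channel width is an assumption for illustration):
channels = 64
proj_params = 4 * channels * channels  # q, k, v and output projections
print(proj_params)                     # 16384 weights, ~64 KB in fp32

# Activation memory for the n x n score matrix, by contrast,
# grows quadratically with the feature-map side length:
for res in (32, 64):
    n = res * res
    print(res, n * n)                  # 32 -> 1048576, 64 -> 16777216
```

So adding attention at more resolutions multiplies the (small) projection weights by the number of layers, while the activation cost at each resolution dominates memory and time; more capacity to route information globally does not have to show up as a larger file.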