ishmaelbelghazi / ali

Adversarially Learned Inference

License: MIT License

Python 48.82% TeX 51.18%

ali's Issues

Where to use the reparametrization trick

In the decoder module, I found that z is sampled from N(0, 1). Where do you use the reparametrization trick described in formulas (2) and (3) of the paper?
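
For reference, the reparametrization trick mentioned in the question can be sketched as below. This is a minimal NumPy illustration, not the repository's code; the names `reparametrize`, `mu`, and `log_sigma` are hypothetical. The trick applies when sampling from a Gaussian whose mean and standard deviation are produced by a network. Drawing z from a fixed N(0, 1) involves no learned parameters, so there is nothing to reparametrize at that point.

```python
import numpy as np

def reparametrize(mu, log_sigma, rng):
    """Draw z ~ N(mu, sigma^2) as a deterministic function of (mu, log_sigma) and noise."""
    eps = rng.standard_normal(mu.shape)  # noise that does not depend on the parameters
    return mu + np.exp(log_sigma) * eps

rng = np.random.default_rng(0)
mu = np.full((10000, 2), 3.0)                 # hypothetical predicted means
log_sigma = np.full((10000, 2), np.log(0.5))  # hypothetical predicted log std-devs
z = reparametrize(mu, log_sigma, rng)
```

Because `eps` is the only source of randomness, gradients with respect to `mu` and `log_sigma` can flow through the sample.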

Semi-supervised learning

I've been trying to reproduce your figures for semi-supervised learning on CIFAR-10 (19.98% with 1000 labels). This result is based on the technique proposed in Salimans et al. (2016), not SVMs. Is there any way you can include your code, or at least any changes to the hyperparameters in ali_cifar10.py?

Thanks in advance for your help.

Discriminator input

In the paper, it's mentioned that the input to the discriminator is a joint pair from either q(x~,z) or p(x,z^). However, in the code, it seems that (z,z^) and (x,x~) are concatenated and sent as input to z_discriminator and x_discriminator, respectively.

There seems to be a discrepancy here. Please let me know if I'm missing something.

Thanks.
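
One possible reading, assuming the concatenation in the code is along the batch axis (an assumption, not confirmed in this thread): stacking (x, x~) and (z^, z) in the same order keeps row i of the x batch aligned with row i of the z batch, so a joint discriminator applied afterwards still sees the pairs (x, z^) and (x~, z). A minimal NumPy sketch with hypothetical shapes and placeholder values:

```python
import numpy as np

batch, x_dim, z_dim = 4, 6, 3
x = np.zeros((batch, x_dim))        # data samples (placeholder values)
z_hat = np.zeros((batch, z_dim))    # encoder outputs for x
z = np.ones((batch, z_dim))         # prior samples
x_tilde = np.ones((batch, x_dim))   # decoder outputs for z

# Concatenate along the batch axis, keeping both halves in the same order.
input_x = np.concatenate([x, x_tilde], axis=0)
input_z = np.concatenate([z_hat, z], axis=0)

# Joining features row by row recovers the joint pairs.
joint = np.concatenate([input_x, input_z], axis=1)
encoder_pairs = joint[:batch]   # rows are (x, z_hat)
decoder_pairs = joint[batch:]   # rows are (x_tilde, z)
```

The zeros and ones only make it easy to see which half each row came from.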

Preprocess_representation has a bug for me

Hi,
I was trying to reproduce the representation learning results of the paper. Everything works fine except the "preprocess_representations" script, which fails with this error:

File "scripts/preprocess_representations", line 32, in preprocess_svhn
bricks=[ali.encoder.layers[-9], ali.encoder.layers[-6],
AttributeError: 'GaussianConditional' object has no attribute 'layers'

Any help would be appreciated.

semi-supervised learning

Hello, I read the paper and the source code. Section 4.3 of the paper says: 'The last three hidden layers of the encoder as well as its output are concatenated to form a 8960-dimensional feature vector.' Could you please tell me how this dimension is computed? Thanks very much.

Choice of Layers for Processing z in Discriminator

In the paper and code you used a set of 1x1 2D convolution layers to process the latent vector z in the discriminator. What was the motivation behind using 2D Convolutions versus fully connected layers or some other kind of convolutional layer? What other architectures did you try, and did you find success with any of those?

Some tested features

Hi @IshmaelBelghazi - I thought it might be useful to document some of the more useful updates in my fork that could be merged back in:

A script that allows training any generic fuel dataset (2fa2727)
CLI improvements exposing --num-epochs, --z-dim, and --splits (67f2c7d)
Options for image_size 128x128 (68cc868, works well) and 256x256 (37aee2f, never converged)
Added --oldmodel option for starting training from existing model (2c80444)

All of this is tested and working well for me. Feel free to steal anything you want, or let me know if you'd like me to prepare a merge request with anything you see that would be useful.

Error while installing ali

$ pip install -e ali
Directory 'ali' is not installable. File 'setup.py' not found.

Any ideas of the reason? Thanks!

Pre-trained models

Hi, I am very interested in your work. Could you upload your pre-trained models (e.g., ImageNet) online? Thanks!

batch norm updates

Hi, any hint about how the batch norm statistics are updated, and why you do the following?

bn_updates = [(p, m * 0.05 + p * 0.95) for p, m in pop_updates]

I'm trying to mimic the same behavior in PyTorch, but I'm not sure what p and m are or when this update happens during execution.
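
The update above has the form of an exponential moving average. A minimal NumPy sketch, assuming `p` is a stored population statistic and `m` is the corresponding statistic computed on the current minibatch (the thread does not confirm this reading, and `ema_update` is a hypothetical name):

```python
import numpy as np

def ema_update(p, m, rate=0.05):
    """Move the population statistic p a fraction `rate` toward the minibatch statistic m."""
    return m * rate + p * (1.0 - rate)

rng = np.random.default_rng(0)
pop_mean = np.zeros(3)  # running estimate, starts far from the true mean
for _ in range(500):
    batch = rng.normal(loc=2.0, scale=1.0, size=(64, 3))
    pop_mean = ema_update(pop_mean, batch.mean(axis=0))  # one update per training step
```

Under this assumption, the closest PyTorch analogue is the `momentum` argument of the batch norm layers, e.g. `torch.nn.BatchNorm2d(num_features, momentum=0.05)`, which weights the new batch statistic by 0.05 when updating the running mean in training mode.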

Model can't be loaded without GPU

I need to load a pretrained model with CPU only, but it appears the cuda arrays are serialized as such. Is there any way around this?

softplus activation

@vdumoulin I'm a bit confused about why you are using the softplus activation** for the discriminator, since the formulas in the paper don't mention it, or at least I can't recognize it there.

**https://github.com/IshmaelBelghazi/ALI/blob/master/ali/bricks.py#L69-L72

BTW, @vdumoulin we met at your talk in the CVC during the break time :D
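
One way softplus can arise without appearing in the paper's formulas: if the discriminator outputs a logit a and D = sigmoid(a), then -log D = softplus(-a) and -log(1 - D) = softplus(a). Whether this is the authors' motivation is an assumption; the sketch below only checks the identity numerically, with hypothetical helper names.

```python
import numpy as np

def softplus(a):
    # log(1 + exp(a)), computed in a numerically stable way
    return np.logaddexp(0.0, a)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

logits = np.linspace(-5.0, 5.0, 11)  # hypothetical discriminator pre-activations
neg_log_d = -np.log(sigmoid(logits))
neg_log_one_minus_d = -np.log(1.0 - sigmoid(logits))

identity_holds = (np.allclose(neg_log_d, softplus(-logits))
                  and np.allclose(neg_log_one_minus_d, softplus(logits)))
```

Under this reading, a loss written as a sum of softplus terms is already the negated log-likelihood, so no explicit minus signs would be needed.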

pretrained imagenet model?

Could you possibly upload this or provide an external link?

Conditional Generation

I'm interested in getting the update to this codebase that includes the conditional generation, as covered in the more recent version of the paper (related image below). Can you let me know if that will be added to the repo?

dropout behavior?

Hi, I am replicating the code with PyTorch.

But I am not sure about the dropout behavior here. Seems that you apply dropout after all layers of the Discriminator (i.e. conv -> dropout -> bn -> dropout -> leaky relu -> dropout etc.), is that correct?

Also, do you apply any preprocessing on the input data? Seems that you rescale it to [0, 1]?

Thanks!

sign on loss function

Hi, in the paper pseudocode, for Loss_d and Loss_g, the gradient ascent is turned into gradient descent by making the whole loss negative (negative signs in front of every term).

However, in the code, I don't see any of those negative signs. What am I missing?

Thank you!

deserialization of models hangs

Training goes well for me using the scripts in experiments with the latest version of blocks, but then when I run any subsequent command that uses the generated model like scripts/sample or scripts/reconstruct, the command hangs indefinitely. My guess is that the deserialization is getting jammed up.

I can look into it more - not yet familiar with the new tar format - but curious if this might be a known issue.

Fuel version problem

I installed the current development version of fuel, but ran into issues when downloading datasets with it.

$ fuel-download celeba 64
$ fuel-convert celeba 64
$ fuel-download celeba 64 --clear

The error message I got is:
fuel-download: error: unrecognized arguments: 64
if I remove 64, I got:
TypeError: __init__() got an unexpected keyword argument 'max_value'

Could someone please specify which versions or commits of fuel and progressbar I should use? Thanks

ImportError: No module named ali.utils

I followed the same steps in the readme file, but when I run this line

$ THEANORC=theanorc python experiments/ali_cifar10.py

I get:

Traceback (most recent call last):
  File "experiments/ali_cifar10.py", line 3, in <module>
    from ali.utils import get_log_odds, conv_brick, conv_transpose_brick, bn_brick
ImportError: No module named ali.utils

mistake in D(x,z) input size

In Table 5 of the paper you state that the input size for D(x,z) is 1024x1x1, which I think is wrong given the output sizes of D(x) and D(z) listed before it. I think it should be 1536x1x1.

Is that assumption correct?