
pytorch_flows's Introduction

Normalizing flows with PyTorch

Implementation and tutorials of normalizing flows with the PyTorch distributions module. The current set of tutorials and implementations covers:

  1. Implementing and optimizing planar flows (a minimal sketch follows this introduction)
  2. Different types of invertible flows (radial, batchnorm, affine)
  3. Using flows in variational inference (VAEs)
  4. Auto-regressive types of flows (RealNVP, MAF, IAF) (still very much a work-in-progress draft)
  5. More advanced and recent types of flows

Sorry everyone for the very long delay; we shall try to finish this tutorial series with new advances in generative flows (GLOW) and more advanced ideas (NODE, FFJORD) in the upcoming weeks :)
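As a taste of the first tutorial, here is a minimal planar flow sketch. It is a simplification for illustration (the notebooks themselves build flows on top of torch.distributions transforms), and it omits the invertibility constraint on u that a practical implementation enforces:

import torch
from torch import nn

# Minimal planar flow f(z) = z + u * tanh(w^T z + b) (Rezende & Mohamed, 2015).
class PlanarFlow(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.u = nn.Parameter(torch.randn(1, dim) * 0.01)
        self.w = nn.Parameter(torch.randn(1, dim) * 0.01)
        self.b = nn.Parameter(torch.zeros(1))

    def forward(self, z):
        lin = z @ self.w.t() + self.b                # (batch, 1)
        f_z = z + self.u * torch.tanh(lin)           # (batch, dim)
        psi = (1 - torch.tanh(lin) ** 2) * self.w    # (batch, dim)
        # |det(I + u psi^T)| = |1 + psi^T u| by the matrix determinant lemma
        log_det = torch.log(torch.abs(1 + psi @ self.u.t()) + 1e-9)
        return f_z, log_det.squeeze(-1)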

pytorch_flows's People

Contributors

acids-ircam, tuelwer


pytorch_flows's Issues

KL Formula

Hi,
Thanks for the tutorial.

I have a question: why, in the loss function, did you use the target distribution instead of the base distribution?
I remember that in the forward KL divergence there is no term for the target distribution.

Thanks
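For context, the objective used in the notebooks is the reverse (variational) KL divergence, which does involve the target density $p$. With base density $q_0$, flow steps $f_k$, and final sample $z_K$, it decomposes as

$$\mathrm{KL}(q_K \,\|\, p) = \mathbb{E}_{z_0 \sim q_0}\Big[\log q_0(z_0) - \sum_{k=1}^{K} \log\Big|\det \frac{\partial f_k}{\partial z_{k-1}}\Big| - \log p(z_K)\Big],$$

where the first term is constant in the flow parameters, leaving exactly the loss quoted in the next issue.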

Optimizing without known target density

Hi

All tutorials I've seen regarding normalizing flows use a loss with a known target density, yours too, e.g.

def loss(density, zk, log_jacobians):
    sum_of_log_jacobians = sum(log_jacobians)
    return (-sum_of_log_jacobians - torch.log(density(zk) + 1e-9)).mean() # << density(zk)

How do I optimize for data points instead? All I can think of is computing the inverse of a data sample into the noise distribution and then optimizing the log-probability; is that the correct way to approach it?
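That is indeed the standard maximum-likelihood setup: map each data point back through the inverse flow, score it under the base density, and add the log-determinant of the inverse Jacobian. A minimal sketch (the affine transform and all names here are illustrative stand-ins, not the repository's classes):

import torch
from torch import distributions as distrib

# Fit a flow to samples by maximum likelihood: log q(x) is computed by
# inverting x into base space, which TransformedDistribution.log_prob does.
base = distrib.Normal(torch.zeros(2), torch.ones(2))
loc = torch.zeros(2, requires_grad=True)
log_scale = torch.zeros(2, requires_grad=True)

def nll(x):
    t = distrib.transforms.AffineTransform(loc, log_scale.exp())
    model = distrib.TransformedDistribution(base, [t])
    return -model.log_prob(x).sum(-1).mean()  # negative log-likelihood

x = torch.randn(64, 2) * 2.0 + 1.0            # stand-in for a data batch
opt = torch.optim.Adam([loc, log_scale], lr=1e-2)
for _ in range(100):
    opt.zero_grad()
    loss = nll(x)
    loss.backward()
    opt.step()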

Bug issue when running flows_01.ipynb

Hi! Thanks for sharing this normalizing flow model with PyTorch, it is really exciting.

When running the fifth cell of code, a ValueError appears concerning the q1 distribution. It seems strange, because the empirical values don't cause any problem; the notebook may have been written against an older version of PyTorch. The following message appears:

ValueError                                Traceback (most recent call last)
<ipython-input> in <module>()
      1 q0_density = torch.exp(q0.log_prob(torch.Tensor(x))).numpy()
----> 2 q1_density = torch.exp(q1.log_prob(torch.Tensor(x))).numpy()
      3 fig, (ax1, ax2) = plt.subplots(1, 2, sharex=True, figsize=(15, 5))
      4 ax1.plot(x, q0_density); ax1.fill_between(x, q0_density, 0, alpha=0.5)
      5 ax1.set_title('$q_0 = \mathcal{N}(0,1)$', fontsize=18);

1 frames
/usr/local/lib/python3.7/dist-packages/torch/distributions/distribution.py in _validate_sample(self, value)
    287         if not valid.all():
    288             raise ValueError(
--> 289                 "Expected value argument "
    290                 f"({type(value).__name__} of shape {tuple(value.shape)}) "
    291                 f"to be within the support ({repr(support)}) "

ValueError: Expected value argument (Tensor of shape (1000,)) to be within the support (GreaterThan(lower_bound=0.0)) of the distribution TransformedDistribution(), but found invalid values.

The expected behaviour would be a density distribution for q1.
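One possible workaround, assuming a recent PyTorch version: the error comes from argument validation, which newer releases enable by default and which rejects evaluating log_prob outside the transform's declared support. Constructing the distribution with validate_args=False lets log_prob be evaluated on the whole grid; this self-contained sketch (an illustration, not the notebook's exact setup) reproduces the situation:

import torch
from torch import distributions as distrib

base = distrib.Normal(0., 1.)
t = distrib.transforms.ExpTransform()   # support of the result is (0, inf)
x = torch.linspace(-4., 4., 1000)       # grid that leaves that support

q1 = distrib.TransformedDistribution(base, [t], validate_args=True)
# q1.log_prob(x) -> ValueError: ... to be within the support ...

q1 = distrib.TransformedDistribution(base, [t], validate_args=False)
density = q1.log_prob(x).exp()          # evaluates; nan off-support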

`flow.final_density.log_prob` strange behaviour

Hi @esling ,

First of all, thanks for sharing these notebooks, they're great.

I'm playing around with the classes you defined, and trying to understand why you're calculating densities manually (explicitly through the change-of-variables formula), rather than using flow.final_density.log_prob.

I thought these should yield the same result, but I tried it and they didn't:

Below are the definitions for the classes I used. They are mostly equal to yours, but I had to make a small change to the signature of the log_abs_det_jacobian method because, according to this, PyTorch expects that method to take both inputs and outputs. I think the reason this isn't working is that I might have misunderstood this method.

Class definitions:
class Flow(transform.Transform, nn.Module):
    """
    purpose of this class is to make `transform.Transform` 'trainable'
    
    simple flows will inherit it
    """
    
    def __init__(self):
        transform.Transform.__init__(self)
        nn.Module.__init__(self)
        #self.bijective = True
    
    # Init all parameters
    def init_parameters(self):
        for param in self.parameters():
            param.data.uniform_(-0.01, 0.01)
            
    # Hacky hash bypass
    def __hash__(self):
        return nn.Module.__hash__(self)

# Flow version of Leaky ReLU
class PReLUFlow(Flow):
    def __init__(self, dim):
        super(PReLUFlow, self).__init__()
        self.alpha = nn.Parameter(torch.Tensor([1]))
        self.bijective = True

    def init_parameters(self):
        for param in self.parameters():
            param.data.uniform_(0.01, 0.99)

    def _call(self, z):
        return torch.where(z >= 0, z, torch.abs(self.alpha) * z)

    def _inverse(self, z):
        return torch.where(z >= 0, z, torch.abs(1. / self.alpha) * z)

    def log_abs_det_jacobian(self, z, y):
        """
        I had to add a dummy "y" var to the method signature because PyTorch expects it, as per:
        https://pytorch.org/docs/stable/distributions.html?highlight=distributions%20transforms#torch.distributions.transforms.Transform.log_abs_det_jacobian
        """
        I = torch.ones_like(z)
        J = torch.where(z >= 0, I, self.alpha * I)
        log_abs_det = torch.log(torch.abs(J) + 1e-5)
        return torch.sum(log_abs_det, dim = 1)

# Main class for normalizing flow
class NormalizingFlow(nn.Module):

    def __init__(self, dim, blocks, flow_length, density):
        super().__init__()
        biject = []
        for f in range(flow_length):
            for b_flow in blocks:
                biject.append(b_flow(dim))
        self.transforms = transform.ComposeTransform(biject)
        self.bijectors = nn.ModuleList(biject)
        self.base_density = density
        self.final_density = distrib.TransformedDistribution(density, self.transforms)
        self.log_det = []

    def forward(self, z):
        self.log_det = []
        # Applies series of flows
        for b in range(len(self.bijectors)):
            y = self.bijectors[b](z)
            self.log_det.append(self.bijectors[b].log_abs_det_jacobian(z, y))
            z = y
        return z, self.log_det

I then tried plotting the density of a simple NormalizingFlow using your approach in the notebooks, and then using flow.final_density.log_prob because I think they should yield the same, but they clearly don't:

The nflow_change_density function below is basically the same as your change_density function, but made to work on NormalizingFlow instances rather than Flow instances.
def nflow_change_density(flow, z):
    """
    changed this function to work on a `NormalizingFlow` instance rather than a `Flow` instance
    """
    # Apply our transform on coordinates
    f_z, log_det = flow(torch.Tensor(z))
    f_z = f_z.detach()
    log_det = log_det[0].detach()
    
    
    q0_density = flow.base_density.log_prob(torch.Tensor(z)).detach().exp()
    
    # Obtain our density
    q1_density = q0_density.squeeze() / np.exp(log_det.squeeze())
    return q1_density, f_z

First using your approach:

nflow = NormalizingFlow(
    dim=2, 
    blocks=[PReLUFlow],
    flow_length=1,
    density=distrib.MultivariateNormal(torch.zeros(2), torch.eye(2))
)

nflow.bijectors[0].alpha.data = torch.Tensor([0.6])


q0_density = nflow.base_density.log_prob(torch.Tensor(z)).exp().detach()
q1_density, f_z = nflow_change_density(nflow, z)
>>> q1_density
tensor([4.9750e-08, 5.1368e-08, 5.3034e-08,  ..., 1.9093e-08, 1.8493e-08,
        1.7910e-08])
# Plot this
fig, (ax1, ax2) = plt.subplots(1, 2, sharey=True, sharex=True, figsize=(15, 5))
ax1.hexbin(z[:,0], z[:,1], C=q0_density.numpy().squeeze(), cmap='rainbow')
ax1.set_title('$q_0 = \mathcal{N}(\mathbf{0},\mathbb{I})$', fontsize=18);
ax2.hexbin(f_z[:,0], f_z[:,1], C=q1_density.numpy().squeeze(), cmap='rainbow')
ax2.set_title('$q_1=prelu(q_0)$', fontsize=18);

[image: hexbin plots of the base density $q_0 = \mathcal{N}(\mathbf{0},\mathbb{I})$ (left) and the transformed density $q_1 = prelu(q_0)$ (right)]

And now using nflow.final_density.log_prob :

f_z, _ = nflow(torch.Tensor(z))
f_z = f_z.detach()

q1_density = nflow.final_density.log_prob(f_z).detach()
>>> q1_density
tensor([510744.4062, 510744.4375, 510744.4688,  ..., 510744.4688,
        510744.4375, 510744.4062])

These are crazy high compared to the values obtained via your approach, and they're log-probs; I didn't even .exp() them yet (and I won't, because it would blow up). I'm going to plot them (the Gaussian is also shown in log-probs here):

# Plot this
fig, (ax1, ax2) = plt.subplots(1, 2, sharey=True, sharex=True, figsize=(15, 5))
ax1.hexbin(z[:,0], z[:,1], C=q0_density.log().numpy().squeeze(), cmap='rainbow')
ax1.set_title('$q_0 = \mathcal{N}(\mathbf{0},\mathbb{I})$', fontsize=18);
ax2.hexbin(f_z[:,0], f_z[:,1], C=q1_density.numpy().squeeze(), cmap='rainbow')
ax2.set_title('$q_1=prelu(q_0)$', fontsize=18);

[image: the same hexbin plots, with both panels in log-prob scale]

I then decided to try to plot the log_probs using your approach, and I get this:

[image: log-probs plotted using the manual change-of-variables approach]

The shape is really similar to the one above (although they're orders of magnitude apart). This gives me the feeling that some constant is not being added or subtracted to the log-probs given by nflow.final_density.log_prob.

What am I missing here? I'd really appreciate it if you could give this some thought.

Thanks in advance!
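A quick way to pin down where the two computations diverge is to sanity-check the manual change-of-variables against TransformedDistribution.log_prob on a transform with a known Jacobian. This minimal sketch (an illustration, not the repository's code) shows the identity that both paths should satisfy:

import torch
from torch import distributions as distrib

base = distrib.Normal(0., 1.)
t = distrib.transforms.ExpTransform()
flow_dist = distrib.TransformedDistribution(base, [t])

z = torch.randn(5)
y = t(z)
# Change of variables: log q1(y) = log q0(z) - log|det dy/dz|
manual = base.log_prob(z) - t.log_abs_det_jacobian(z, y)
auto = flow_dist.log_prob(y)
print(torch.allclose(manual, auto))  # True

If a custom flow breaks this identity, its log_abs_det_jacobian (here summed over dim=1, with a +1e-5 epsilon) is the first place to look.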

An error was encountered while running the VAE model based on the flows.

AttributeError                            Traceback (most recent call last)
<ipython-input> in <module>
      7 block_planar = [PlanarFlow]
      8 # Create normalizing flow
----> 9 flow = NormalizingFlow(dim=n_latent, blocks=block_planar, flow_length=16, density=distrib.MultivariateNormal(torch.zeros(n_latent), torch.eye(n_latent)))
     10 # Construct encoder and decoder
     11 encoder, decoder = construct_encoder_decoder(nin, n_hidden = n_hidden, n_latent = n_latent, n_classes = num_classes)

<ipython-input> in __init__(self, dim, blocks, flow_length, density)
     11         self.bijectors = nn.ModuleList(biject)
     12         self.base_density = density
---> 13         self.final_density = distrib.TransformedDistribution(density, self.transforms)
     14         self.log_det = []
     15

D:\LenovoSoftstore\AI_software\Anaconda3\envs\mainpy\lib\site-packages\torch\distributions\transformed_distribution.py in __init__(self, base_distribution, transforms, validate_args)
     57         base_event_dim = len(base_distribution.event_shape)
     58         transform = ComposeTransform(self.transforms)
---> 59         domain_event_dim = transform.domain.event_dim
     60         if len(base_shape) < domain_event_dim:
     61             raise ValueError("base_distribution needs to have shape with size at least {}, but got {}."

D:\LenovoSoftstore\AI_software\Anaconda3\envs\mainpy\lib\site-packages\torch\distributions\transforms.py in domain(self)
    285         if not self.parts:
    286             return constraints.real
--> 287         domain = self.parts[0].domain
    288         # Adjust event_dim to be maximum among all parts.
    289         event_dim = self.parts[-1].codomain.event_dim

D:\LenovoSoftstore\AI_software\Anaconda3\envs\mainpy\lib\site-packages\torch\distributions\transforms.py in domain(self)
    285         if not self.parts:
    286             return constraints.real
--> 287         domain = self.parts[0].domain
    288         # Adjust event_dim to be maximum among all parts.
    289         event_dim = self.parts[-1].codomain.event_dim

D:\LenovoSoftstore\AI_software\Anaconda3\envs\mainpy\lib\site-packages\torch\nn\modules\module.py in __getattr__(self, name)
   1128         if name in modules:
   1129             return modules[name]
-> 1130         raise AttributeError("'{}' object has no attribute '{}'".format(
   1131             type(self).__name__, name))
   1132

AttributeError: 'PlanarFlow' object has no attribute 'domain'
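A possible workaround, assuming a recent PyTorch version: TransformedDistribution now reads the domain and codomain attributes of each transform, and the hybrid Transform/nn.Module class in the notebook never defines them, so the lookup falls through to nn.Module.__getattr__ and raises. Declaring them on the base Flow class (a sketch, not the repository's exact code) avoids the error:

import torch
from torch import nn
from torch.distributions import constraints, transforms as transform

class Flow(transform.Transform, nn.Module):
    # Newer PyTorch composes transforms by reading .domain / .codomain;
    # declaring them as class attributes keeps nn.Module.__getattr__
    # from intercepting the lookup and raising AttributeError.
    domain = constraints.real_vector
    codomain = constraints.real_vector

    def __init__(self):
        transform.Transform.__init__(self)
        nn.Module.__init__(self)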

in affine transforms, wrong order in matmul?

Hey, I've been playing with the classes you define in your notebooks, and it seems to me that in the affine transformations you're multiplying the weights and z in the wrong order. I changed it in my implementation to (self.weights @ z.unsqueeze(-1)).squeeze(-1), and made the corresponding change in the _inverse method too (see the sketch below).
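A sketch of the suggested fix, assuming self.weights is a square (dim, dim) parameter and z is a batch of row vectors of shape (batch, dim) (the class below is an illustration, not the notebook's exact code):

import torch
from torch import nn

class AffineFlow(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.weights = nn.Parameter(torch.eye(dim))

    def _call(self, z):
        # Treat each row of z as a column vector, so the map is z -> W z.
        return (self.weights @ z.unsqueeze(-1)).squeeze(-1)

    def _inverse(self, z):
        # Inverse map z -> W^{-1} z.
        return (torch.inverse(self.weights) @ z.unsqueeze(-1)).squeeze(-1)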

Loss function missing in first notebook

In flows_01.ipynb, the cell below the text "Now the only missing ingredient is the loss function that is simply defined as follows" is empty. I believe the loss function was defined here.

Can you please share the loss function? Thanks.
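For reference, the loss quoted in the "Optimizing without known target density" issue above matches what the missing cell defines:

import torch

def loss(density, zk, log_jacobians):
    # Free-energy style objective: negative sum of log-Jacobians minus the
    # log target density at the final sample z_k, averaged over the batch.
    sum_of_log_jacobians = sum(log_jacobians)
    return (-sum_of_log_jacobians - torch.log(density(zk) + 1e-9)).mean()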
