medvae's People

Contributors

nmontanabrown

medvae's Issues

Translate into Pyro?

Issue Description

It could be a nice side project to get a handle on Pyro. It is not core functionality, but it could be useful.

ConvCVAE Class

Issue Description

We might want to implement a convolutional conditional variational autoencoder, something like the class below, with the following test:

# Assumed imports; `e` and `u` are aliases for the repo's encoder and
# utility modules (their exact import paths are not shown here):
import copy
from typing import List, Union

import numpy as np
import torch
from torch import nn

class ConvCVAE(CVAEBase):
    """
    Implements a Convolutional
    Conditional Variational Auto-Encoder
    (CVAE) variant architecture.
    """
    def __init__(self,
                 num_filters:List[int],
                 input_shape:List[int],
                 latent_dim:int,
                 kernel_size:List[int],
                 stride:List[int],
                 padding:List[int],
                 padding_mode:List[str],
                 dilation:Union[List[list], List[int]],
                 num_labels:int,
                 **kwargs):
        """
        :param layer_sizes: List[int], shape of each FCN.
        :param input_size: List[int], input shape to network.
        :param latent_dim: int, latent layer size.
        :param num_labels: int, number of potential classes.
        """
        super(ConvCVAE, self).__init__(**kwargs)
        self.build_network(num_filters,
                           input_shape,
                           latent_dim,
                           kernel_size,
                           stride,
                           padding,
                           padding_mode,
                           dilation,
                           num_labels,
                           **kwargs)
        self.convolutional_encoder = True

    def build_network(self,
                      num_filters,
                      input_shape,
                      latent_dim,
                      kernel_size,
                      stride,
                      padding,
                      padding_mode,
                      dilation,
                      num_labels,
                      **kwargs):
        """
        :param layer_sizes: List[int], shape of each FCN.
        :param input_size: List[int], input shape to network.
        :param latent_dim: int, latent layer size.
        :param num_labels: int, number of potential classes.
        """
        reverse_layer_sizes = copy.deepcopy(num_filters)
        reverse_layer_sizes.reverse()
        if num_filters[0] != input_shape[0]+num_labels:
            # the initial number of filters does not account for
            # the concatenation of the labels to the input
            # so replace the conv channels at the beginning of the
            # network.
            self.encoder = e.ConvVariationalEncoder(num_filters,
                                                [input_shape[0]+num_labels]+input_shape[1:],
                                                latent_dim,
                                                kernel_size,
                                                stride,
                                                padding,
                                                padding_mode,
                                                dilation,
                                                **kwargs)
        else:
            self.encoder = e.ConvVariationalEncoder(num_filters,
                                                input_shape,
                                                latent_dim,
                                                kernel_size,
                                                stride,
                                                padding,
                                                padding_mode,
                                                dilation,
                                                **kwargs)
        self.num_filters = num_filters

        conv_inputs = u.process_convolutional_inputs([kernel_size, stride, padding, dilation])
        kernel_size, stride, padding, dilation = conv_inputs
        reversed_conv_params = u.reverse_lists([kernel_size,
                                                stride,
                                                padding,
                                                dilation])
        # For the decoder, we make sure to concatenate the labels in the FCN
        # as the first input layer into the new encoder.
        self.decoder_bridge = nn.Linear(latent_dim+num_labels, num_filters[-1])
        self.decoder_bridge_1 = nn.Linear(num_filters[-1], np.prod(self.encoder.out_shape))
        self.last_conv = nn.Conv2d(num_filters[0],
                                   input_shape[0],
                                   kernel_size=1,
                                   dilation=1,
                                   padding=0,
                                   padding_mode="zeros",
                                   stride=1)
        out_shape = self.encoder.all_shapes
        new_out = list(reversed(copy.deepcopy(out_shape)))
        out_shape_1 = []
        for i in range(len(new_out)):
            out_shape_1.append([new_out[i][1], new_out[i][2]])
        self.decoder = e.ConvDecoder(reverse_layer_sizes,
                                     input_shape=self.encoder.out_shape,
                                     kernel_size=reversed_conv_params[0],
                                     output_layers=out_shape_1[1:],
                                     stride=reversed_conv_params[1],
                                     padding=reversed_conv_params[2],
                                     padding_mode=padding_mode,
                                     dilation=reversed_conv_params[3],
                                     **kwargs)
def test_ConvCVAE():
    """
    """
    enc = ConvCVAE(num_filters=[6,5,3],
                  input_shape=[1, 50, 50],
                  latent_dim=2,
                  kernel_size=[[1,1],[2,2], [3,3]],
                  stride=[2,2,2],
                  padding=[0,0,0],
                  padding_mode=["zeros", "zeros", "zeros"],
                  dilation=[1,1,1],
                  num_labels=3)
    out = enc.training_step([torch.ones((1,1,50,50)), torch.ones(1,3,50,50)], 0)
    assert list(out.shape) == [1]

However, this is buggy as written: the concatenation of the "label", which here takes the form of a segmentation-style label map, will not work at the latent dimension. Alternatively, if we used one-hot class labels instead, how would we incorporate them into the encoder, as in the original CVAE implementation?
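One way the one-hot variant could work, sketched below (the helper names are hypothetical, not from the repo): broadcast the one-hot vector to constant spatial maps and concatenate them to the input channels before the encoder, then concatenate the raw one-hot vector to the latent sample before the decoder bridge, mirroring the original CVAE conditioning at both ends.

```python
import torch


def condition_inputs(x: torch.Tensor, y_onehot: torch.Tensor) -> torch.Tensor:
    """Concatenate one-hot class labels to the image channels.

    x: (N, C, H, W) image batch; y_onehot: (N, num_labels).
    Each label entry is broadcast to a constant (H, W) map, so the
    encoder sees C + num_labels input channels.
    """
    n, _, h, w = x.shape
    label_maps = y_onehot[:, :, None, None].expand(-1, -1, h, w)
    return torch.cat([x, label_maps], dim=1)


def condition_latent(z: torch.Tensor, y_onehot: torch.Tensor) -> torch.Tensor:
    """Concatenate one-hot labels to the latent sample before decoding.

    z: (N, latent_dim); returns (N, latent_dim + num_labels), which is
    what a decoder bridge of size latent_dim + num_labels expects.
    """
    return torch.cat([z, y_onehot], dim=1)
```

With `input_shape=[1, 50, 50]` and `num_labels=3`, `condition_inputs` yields a 4-channel encoder input, matching the `input_shape[0]+num_labels` adjustment in `build_network`, and `condition_latent` matches the `latent_dim+num_labels` width of `decoder_bridge`.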
