medvae's People

Contributors

nmontanabrown

medvae's Issues

Translate into Pyro?

Issue Description

It could be a nice side project to get a handle on Pyro. It is not core functionality, but it could be useful.

ConvCVAE Class

Issue Description

We might want to implement a convolutional conditional variational autoencoder, something like the class below, with the following test:

# Assumed imports; `e` and `u` are aliases for the repo's encoder and
# utility modules (their exact import paths are not shown here):
import copy
from typing import List, Union

import numpy as np
import torch
from torch import nn

class ConvCVAE(CVAEBase):
    """
    Implements a Convolutional
    Conditional Variational Auto-Encoder
    (CVAE) variant architecture.
    """
    def __init__(self,
                 num_filters:List[int],
                 input_shape:List[int],
                 latent_dim:int,
                 kernel_size:List[int],
                 stride:List[int],
                 padding:List[int],
                 padding_mode:List[str],
                 dilation:Union[List[list], List[int]],
                 num_labels:int,
                 **kwargs):
        """
        :param layer_sizes: List[int], shape of each FCN.
        :param input_size: List[int], input shape to network.
        :param latent_dim: int, latent layer size.
        :param num_labels: int, number of potential classes.
        """
        super(ConvCVAE, self).__init__(**kwargs)
        self.build_network(num_filters,
                           input_shape,
                           latent_dim,
                           kernel_size,
                           stride,
                           padding,
                           padding_mode,
                           dilation,
                           num_labels,
                           **kwargs)
        self.convolutional_encoder = True

    def build_network(self,
                      num_filters,
                      input_shape,
                      latent_dim,
                      kernel_size,
                      stride,
                      padding,
                      padding_mode,
                      dilation,
                      num_labels,
                      **kwargs):
        """
        :param layer_sizes: List[int], shape of each FCN.
        :param input_size: List[int], input shape to network.
        :param latent_dim: int, latent layer size.
        :param num_labels: int, number of potential classes.
        """
        reverse_layer_sizes = copy.deepcopy(num_filters)
        reverse_layer_sizes.reverse()
        if num_filters[0] != input_shape[0]+num_labels:
            # the initial number of filters does not account for
            # the concatenation of the labels to the input
            # so replace the conv channels at the beginning of the
            # network.
            self.encoder = e.ConvVariationalEncoder(num_filters,
                                                [input_shape[0]+num_labels]+input_shape[1:],
                                                latent_dim,
                                                kernel_size,
                                                stride,
                                                padding,
                                                padding_mode,
                                                dilation,
                                                **kwargs)
        else:
            self.encoder = e.ConvVariationalEncoder(num_filters,
                                                input_shape,
                                                latent_dim,
                                                kernel_size,
                                                stride,
                                                padding,
                                                padding_mode,
                                                dilation,
                                                **kwargs)
        self.num_filters = num_filters

        conv_inputs = u.process_convolutional_inputs([kernel_size, stride, padding, dilation])
        kernel_size, stride, padding, dilation = conv_inputs
        reversed_conv_params = u.reverse_lists([kernel_size,
                                                stride,
                                                padding,
                                                dilation])
        # For the decoder, we make sure to concatenate the labels in the FCN
        # as the first input layer into the new encoder.
        self.decoder_bridge = nn.Linear(latent_dim+num_labels, num_filters[-1])
        self.decoder_bridge_1 = nn.Linear(num_filters[-1], np.prod(self.encoder.out_shape))
        self.last_conv = nn.Conv2d(num_filters[0],
                                   input_shape[0],
                                   kernel_size=1,
                                   dilation=1,
                                   padding=0,
                                   padding_mode="zeros",
                                   stride=1)
        out_shape = self.encoder.all_shapes
        new_out = list(reversed(copy.deepcopy(out_shape)))
        out_shape_1 = []
        for i in range(len(new_out)):
            out_shape_1.append([new_out[i][1], new_out[i][2]])
        self.decoder = e.ConvDecoder(reverse_layer_sizes,
                                     input_shape=self.encoder.out_shape,
                                     kernel_size=reversed_conv_params[0],
                                     output_layers=out_shape_1[1:],
                                     stride=reversed_conv_params[1],
                                     padding=reversed_conv_params[2],
                                     padding_mode=padding_mode,
                                     dilation=reversed_conv_params[3],
                                     **kwargs)
def test_ConvCVAE():
    """
    """
    enc = ConvCVAE(num_filters=[6,5,3],
                  input_shape=[1, 50, 50],
                  latent_dim=2,
                  kernel_size=[[1,1],[2,2], [3,3]],
                  stride=[2,2,2],
                  padding=[0,0,0],
                  padding_mode=["zeros", "zeros", "zeros"],
                  dilation=[1,1,1],
                  num_labels=3)
    out = enc.training_step([torch.ones((1,1,50,50)), torch.ones(1,3,50,50)], 0)
    assert list(out.shape) == [1]

However, this is buggy as written: the concatenation of the "label", which here takes the form of a segmentation-style label map, will not work at the latent dimension. Alternatively, if we used one-hot class labels instead, how would we incorporate them into the encoder, as in the original CVAE implementation?
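One way the one-hot variant could work, sketched below (the helper names are hypothetical, not from the repo): broadcast the one-hot vector to constant spatial maps and concatenate them to the input channels before the encoder, then concatenate the raw one-hot vector to the latent sample before the decoder bridge, mirroring the original CVAE conditioning at both ends.

```python
import torch


def condition_inputs(x: torch.Tensor, y_onehot: torch.Tensor) -> torch.Tensor:
    """Concatenate one-hot class labels to the image channels.

    x: (N, C, H, W) image batch; y_onehot: (N, num_labels).
    Each label entry is broadcast to a constant (H, W) map, so the
    encoder sees C + num_labels input channels.
    """
    n, _, h, w = x.shape
    label_maps = y_onehot[:, :, None, None].expand(-1, -1, h, w)
    return torch.cat([x, label_maps], dim=1)


def condition_latent(z: torch.Tensor, y_onehot: torch.Tensor) -> torch.Tensor:
    """Concatenate one-hot labels to the latent sample before decoding.

    z: (N, latent_dim); returns (N, latent_dim + num_labels), which is
    what a decoder bridge of size latent_dim + num_labels expects.
    """
    return torch.cat([z, y_onehot], dim=1)
```

With `input_shape=[1, 50, 50]` and `num_labels=3`, `condition_inputs` yields a 4-channel encoder input, matching the `input_shape[0]+num_labels` adjustment in `build_network`, and `condition_latent` matches the `latent_dim+num_labels` width of `decoder_bridge`.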
