
ncullen93 / torchsample

1.9K stars · 51 watchers · 302 forks · 3.77 MB

Train AI models efficiently on medical images using any framework

License: GNU Affero General Public License v3.0

Python 99.35% Shell 0.65%
pytorch deep-learning keras neuroimaging medical-imaging

torchsample's Introduction

Nitrain: a medical imaging-native AI framework


Nitrain (formerly torchsample) is a framework-agnostic Python library for sampling and augmenting medical images, training models on medical imaging datasets, and visualizing results in a medical imaging context.

The nitrain library is unique in that it makes training models as simple as possible by providing sensible defaults and a high level of abstraction. It also supports multiple frameworks - torch, tensorflow, and keras - with the goal of adding even more.

Full examples of training medical imaging AI models using nitrain can be found at the Tutorials page. If you are interested more generally in medical imaging AI, check out Practical medical imaging AI techniques with Python (expected early 2025).


Quickstart

Here is an example of using nitrain to train a semantic segmentation model; it demonstrates much of the core functionality.

import nitrain as nt
from nitrain import transforms as tx
from nitrain.readers import ImageReader

# create dataset from a folder of images
dataset = nt.Dataset(inputs=ImageReader('sub-*/anat/*_T1w.nii.gz'),
                     outputs=ImageReader('sub-*/anat/*_aparc+aseg.nii.gz'),
                     transforms={
                         'inputs': tx.NormalizeIntensity(0,1),
                         ('inputs', 'outputs'): tx.Resize((64,64,64))
                     },
                     base_dir='~/desktop/ds004711/')

# create loader with random transforms
loader = nt.Loader(dataset,
                   images_per_batch=4,
                   sampler=nt.SliceSampler(batch_size=32, axis=2),
                   transforms={
                           'inputs': tx.RandomNoise(sd=0.2)
                   })

# create model from architecture
arch_fn = nt.fetch_architecture('unet', dim=2)
model = arch_fn(input_image_size=(64,64,1),
                mode='segmentation')

# create trainer and fit model
trainer = nt.Trainer(model, task='segmentation')
trainer.fit(loader, epochs=100)

Installation

The latest release of nitrain can be installed from PyPI:

pip install nitrain

Or you can install the latest development version directly from github:

python -m pip install git+https://github.com/nitrain/nitrain.git

Dependencies

The ants Python package is a key dependency that lets you efficiently read, operate on, and visualize medical images. Additionally, you can use keras (tf.keras or keras3), tensorflow, or pytorch as the backend for creating your models.
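
As a quick check that the imaging stack works, here is a minimal snippet using the documented ANTsPy API (the sample image name 'r16' ships with ANTsPy):

import ants

# load one of ANTsPy's bundled sample images and display it
img = ants.image_read(ants.get_ants_data('r16'))
ants.plot(img)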


Resources

The following links can be helpful in becoming more familiar with nitrain.

  • Introduction tutorials [Link]
  • Segmentation examples [Link]
  • Classification examples [Link]
  • Registration examples [Link]
  • ANTsPy repository [Link]

Contributing

If you have a question or a bug report, the best way to get help is by posting an issue on the GitHub page. I would be happy to welcome any new contributors or ideas to the project. If you want to add code, the best way to get started is by posting an issue or contacting me at [email protected].

You can support this work by starring the repository or posting a feature request in the issues tab. These actions help increase the project's impact and community reach.

torchsample's People

Contributors

ncullen93


torchsample's Issues

`fit_loader` vs. `fit_generator`

I feel that fit_loader is a special case of Keras' fit_generator, and I found myself in a situation where I was missing the latter.

Is there a reason why fit_loader is not implemented as syntactic sugar for fit_generator and fit_generator is missing completely?
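
One way to get both, sketched below as standalone methods (hypothetical code, not the torchsample API; _train_on_batch stands in for the trainer's internal step):

def fit_generator(self, generator_fn, steps_per_epoch, nb_epoch=10):
    for epoch in range(nb_epoch):
        gen = generator_fn()  # restart the generator each epoch
        for step in range(steps_per_epoch):
            x_batch, y_batch = next(gen)
            self._train_on_batch(x_batch, y_batch)  # hypothetical helper

def fit_loader(self, loader, nb_epoch=10):
    # fit_loader reduces to syntactic sugar: a DataLoader is a
    # restartable iterable with a known length
    return self.fit_generator(lambda: iter(loader),
                              steps_per_epoch=len(loader),
                              nb_epoch=nb_epoch)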

[contributions welcome] add initializer classes and `set_initializers` method

Add the capability to create initializers like you would regularizers, with name patterns to filter which layers get the initializers,
e.g.:

class Glorot(Initializer):
    def __init__(self, some_params, module_filter):
        pass
    def __call__(self, module):
        pass

conv_init = Glorot(module_filter='*conv*')
fc_init = Xavier(module_filter='*fc*')
model.set_initializers([conv_init, fc_init])
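
For reference, a rough sketch of how module_filter could work (hypothetical code, not torchsample's; it matches fnmatch-style globs against named_modules and uses the current torch.nn.init naming):

import fnmatch
import torch.nn as nn
import torch.nn.init as init

class Glorot(object):
    """Apply Glorot/Xavier uniform init to submodules whose name
    matches an fnmatch-style module_filter."""
    def __init__(self, module_filter='*'):
        self.module_filter = module_filter

    def __call__(self, model):
        for name, module in model.named_modules():
            if (fnmatch.fnmatch(name, self.module_filter)
                    and isinstance(module, (nn.Conv2d, nn.Linear))):
                init.xavier_uniform_(module.weight)

# usage mirroring the proposal:
# Glorot(module_filter='*conv*')(model)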

cannot import name th_meshgrid

Hi there,

There is an import error for

th_meshgrid

whenever I try to import torchsample. In utils.py I can only find th_meshgrid2d and th_meshgrid3d.

[contributions welcome] Create a `GANTrainer` class to train GANs

This class would abstract away the GAN training loop while still providing the needed flexibility. It would work similarly to SuperModule but would be more specific to GANs.

A good proof-of-concept would be to run the pytorch dcgan example with this trainer.

For instance:

import torch
import torch.nn as nn
import torch.optim as optim

# hyperparameters as in the PyTorch DCGAN example
nz, ngf, ndf, nc = 100, 64, 64, 3
lr, beta1 = 0.0002, 0.5
ngpu = 1

class _netG(nn.Module):
    def __init__(self, ngpu):
        super(_netG, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is Z, going into a convolution
            nn.ConvTranspose2d(     nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # state size. (ngf*8) x 4 x 4
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # state size. (ngf*4) x 8 x 8
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # state size. (ngf*2) x 16 x 16
            nn.ConvTranspose2d(ngf * 2,     ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            # state size. (ngf) x 32 x 32
            nn.ConvTranspose2d(    ngf,      nc, 4, 2, 1, bias=False),
            nn.Tanh()
            # state size. (nc) x 64 x 64
        )

    def forward(self, input):
        if isinstance(input.data, torch.cuda.FloatTensor) and self.ngpu > 1:
            output = nn.parallel.data_parallel(self.main, input, range(self.ngpu))
        else:
            output = self.main(input)
        return output

class _netD(nn.Module):
    def __init__(self, ngpu):
        super(_netD, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is (nc) x 64 x 64
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf) x 32 x 32
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*2) x 16 x 16
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*4) x 8 x 8
            nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 8),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*8) x 4 x 4
            nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),
            nn.Sigmoid()
        )

    def forward(self, input):
        if isinstance(input.data, torch.cuda.FloatTensor) and self.ngpu > 1:
            output = nn.parallel.data_parallel(self.main, input, range(self.ngpu))
        else:
            output = self.main(input)

        return output.view(-1, 1)

netG = _netG(ngpu)
netD = _netD(ngpu)

trainer = GANTrainer(generator=netG, discriminator=netD)
trainer.set_loss(nn.BCELoss())

optimizerD = optim.Adam(netD.parameters(), lr=lr, betas=(beta1, 0.999))
optimizerG = optim.Adam(netG.parameters(), lr=lr, betas=(beta1, 0.999))
trainer.set_optimizers(generator=optimizerG, discriminator=optimizerD)

trainer.fit(...)

error while importing torch sample

Traceback (most recent call last):
  File "/Users/devansh20la/Documents/Vision lab/Melanoma/ResNet/Exp w:o age/ResNet50_val.py", line 9, in <module>
    import torchsample
  File "build/bdist.macosx-10.7-x86_64/egg/torchsample/__init__.py", line 6, in <module>
  File "/Users/devansh20la/anaconda2/lib/python2.7/site-packages/torchsample-0.1.3-py2.7.egg/torchsample/datasets.py", line 106
inputs = np.empty((len(load_range), *_parse_shape(input_sample)))
^
SyntaxError: invalid syntax
[Finished in 0.5s with exit code 1]

Python 2 compatibility

Great library, love it!

Some of the tools are not Python 2 compatible. For instance, RandomChoiceRotate (and anything that uses util.th_random_choice) does not work, because in Python 2 the round function returns a float, not an integer, and the torch.zeros call below throws an exception.

idx_vec = th.cat([th.zeros(round(p[i]*1000))+i for i in range(len(p))])

I can make a pull request to try to fix it if you like. I would advise running tests on both Python 2 and Python 3 interpreters.
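
A minimal version of that fix, wrapping round() in int() as the reporter suggests, would be:

# int(round(...)) behaves the same on Python 2 and 3
idx_vec = th.cat([th.zeros(int(round(p[i]*1000))) + i for i in range(len(p))])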

trainer does not work with FolderDataset

AttributeError: 'FolderDataset' object has no attribute 'num_inputs'
when attempting to call fit_loader with a FolderDataset dataset.

Same problem with the default torch ImageFolder dataset

Error when using regularizers, but model does not have layer

I get the following error when my model does not have any convolutional layers, but its sub-models do.

torchsample/modules/module_trainer.py", line 595, in fit_loader
batch_logs['regularizer_loss'] = regularizer_loss.data[0]

When I add
self.conv1 = nn.Conv2d(64, 32, kernel_size=3, stride=1, padding=1)

it works (though I do not use self.conv1 in any computation).

Are regularizers for nested models even supported? :/

Early stopping does not work

Hello,

It appears as though the EarlyStopping callbacks do not work as they should. Having defined the ModuleTrainer as below,

learning_rate = 1e-5
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
loss = torch.nn.BCEWithLogitsLoss(size_average=False)
regularizers = [L2Regularizer(scale=1e-5)]
callbacks = [EarlyStopping(monitor='val_loss', patience=5, min_delta=1)]
optimizer.zero_grad()

trainer = ModuleTrainer(model)
trainer.compile(loss=loss, 
                callbacks=callbacks,
                regularizers=regularizers,
                optimizer=optimizer)

the training process does not stop early, even though the validation loss stays the same for far too many epochs (as shown below):

...
Epoch 4983/5000: 19 batches [00:00, 117.97 batches/s, loss=20.4344, reg_loss=0.0011, val_loss=16.2812]
Epoch 4984/5000: 19 batches [00:00, 116.54 batches/s, loss=20.4344, reg_loss=0.0011, val_loss=16.2812]
Epoch 4985/5000: 19 batches [00:00, 110.11 batches/s, loss=20.4344, reg_loss=0.0011, val_loss=16.2812]
Epoch 4986/5000: 19 batches [00:00, 110.94 batches/s, loss=20.4344, reg_loss=0.0011, val_loss=16.2812]
Epoch 4987/5000: 19 batches [00:00, 105.54 batches/s, loss=20.4344, reg_loss=0.0011, val_loss=16.2812]
Epoch 4988/5000: 19 batches [00:00, 118.59 batches/s, loss=20.4344, reg_loss=0.0011, val_loss=16.2812]
Epoch 4989/5000: 19 batches [00:00, 117.49 batches/s, loss=20.4344, reg_loss=0.0011, val_loss=16.2812]
Epoch 4990/5000: 19 batches [00:00, 108.96 batches/s, loss=20.4344, reg_loss=0.0011, val_loss=16.2812]
Epoch 4991/5000: 19 batches [00:00, 111.27 batches/s, loss=20.4344, reg_loss=0.0011, val_loss=16.2812]
Epoch 4992/5000: 19 batches [00:00, 117.38 batches/s, loss=20.4344, reg_loss=0.0011, val_loss=16.2812]
Epoch 4993/5000: 19 batches [00:00, 118.45 batches/s, loss=20.4344, reg_loss=0.0011, val_loss=16.2812]
Epoch 4994/5000: 19 batches [00:00, 115.60 batches/s, loss=20.4344, reg_loss=0.0011, val_loss=16.2812]
Epoch 4995/5000: 19 batches [00:00, 110.75 batches/s, loss=20.4344, reg_loss=0.0011, val_loss=16.2812]
Epoch 4996/5000: 19 batches [00:00, 111.68 batches/s, loss=20.4344, reg_loss=0.0011, val_loss=16.2812]
Epoch 4997/5000: 19 batches [00:00, 115.34 batches/s, loss=20.4344, reg_loss=0.0011, val_loss=16.2812]
Epoch 4998/5000: 19 batches [00:00, 110.62 batches/s, loss=20.4344, reg_loss=0.0011, val_loss=16.2812]
Epoch 4999/5000: 19 batches [00:00, 109.61 batches/s, loss=20.4344, reg_loss=0.0011, val_loss=16.2812]
Epoch 5000/5000: 19 batches [00:00, 110.49 batches/s, loss=20.4344, reg_loss=0.0011, val_loss=16.2812]

Any ideas what I am doing wrong, or is there a bug?

P.S. Same story if I add callbacks to the model object with model.set_callback.

Train and validation data assumed same size?

Assuming train/validation data sets are 80% and 20% of the whole dataset, respectively.

When I run

trainer.fit(X_train, y_train, 
            val_data=(X_test, y_test),
            num_epoch=20,
            batch_size=128,
            verbose=1)

I get

home/xxx/torchsample/torchsample/modules/module_trainer.py in fit(self, inputs, targets, val_data, num_epoch, batch_size, shuffle, cuda_device, verbose)
    211             num_val_inputs, num_val_targets = _parse_num_inputs_and_targets(val_data[0], val_data[1])
    212             if (num_inputs != num_val_inputs) or (num_targets != num_val_targets):
--> 213                 raise ValueError('num_inputs != num_val_inputs or num_targets != num_val_targets')
    214             val_inputs, val_targets = val_data
    215         has_val_data = val_data is not None

ValueError: num_inputs != num_val_inputs or num_targets != num_val_targets

Train and validation data sets should have the same sizes? O_o

Write tests

Need unit tests and integration tests. Keras' tests are probably a good place to start. I like the idea of putting them in a separate tests folder at the top directory outside of the source code (like keras) instead of inside at the sub-module level (like sklearn). I prefer pytest over unittest, since pytest is more lightweight.

Will make a checklist eventually.

Used command [python setup.py install --user] to install torchsample but it failed

I want to use torchsample to do transforms on my datasets, but I don't have the su access needed to run [python setup.py install]. After googling it, I ran [python setup.py install --user] to install torchsample. But an error occurs:

File "build/bdist.linux-x86_64/egg/torchsample/datasets.py", line 106
inputs = np.empty((len(load_range), *_parse_shape(input_sample)))
^
SyntaxError: invalid syntax

When I finish installing torchsample, I try to use [from torchsample.transforms import Affine] to apply an affine transform to my dataset, but this appears:

Traceback (most recent call last):
File "", line 1, in <module>
File "build/bdist.linux-x86_64/egg/torchsample/__init__.py", line 6, in <module>
File "/home/guest1/.local/lib/python2.7/site-packages/torchsample-0.1.3-py2.7.egg/torchsample/datasets.py", line 106
inputs = np.empty((len(load_range), *_parse_shape(input_sample)))
^
SyntaxError: invalid syntax

I am totally a newbie to pytorch and I don't know how to solve this now TT

TODO: update docstrings

Hi all, just FYI, I need to update most of the docstrings this weekend; some may be way wrong as of now. Thanks :)

[feature request] ModuleTrainer class

Could potentially add a ModuleTrainer or ModelTrainer class that works similarly to SuperModule but can take in one or more normal nn.Module classes. This would allow for support of pre-trained networks and more flexible training structures, while also allowing seamless integration with all other pytorch code.

@recastrodiaz

Error when some layers are frozen

When I am trying to finetune a pretrained network, I freeze some layers' params using requires_grad=False. The optimizer then tries to optimize all params, causing
ValueError: optimizing a parameter that doesn't require gradients. Is there a way to only pass params that have requires_grad set to True?
Thanks
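
The standard PyTorch workaround is to hand the optimizer only the parameters that still require gradients (model and learning rate below are placeholders):

import torch.optim as optim

# keep only the parameters that were not frozen
trainable_params = [p for p in model.parameters() if p.requires_grad]
optimizer = optim.Adam(trainable_params, lr=1e-4)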

Need update the notebook example "Transforms with Pytorch and Torchsample"

Hi,

I tried to run several cells of the notebook "Transforms with Pytorch and Torchsample" and it looks like the code is not updated to the latest version.
Here is a list of the problems found:

  1. RandomCrop: when I execute the following cell (number 11 in the notebook)
x_example = add_channel(x_train_mnist[0])
print('Before TFORM: ' , x_example.size())
x_crop = rand_crop(x_example)
print('After TFORM: ' , x_crop.size())
plt.imshow(x_crop[0].numpy())
plt.show()

I get the following output:

Before TFORM:  torch.Size([1, 28, 28])

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-10-52967e39be2f> in <module>()
      1 x_example = add_channel(x_train_mnist[0])
      2 print('Before TFORM: ' , x_example.size())
----> 3 x_crop = rand_crop(x_example)
      4 print('After TFORM: ' , x_crop.size())
      5 plt.imshow(x_crop[0].numpy())

/usr/local/lib/python3.5/site-packages/torchsample-0.1.3-py3.5.egg/torchsample/transforms/tensor_transforms.py in __call__(self, *inputs)
    455     def __call__(self, *inputs):
    456         h_idx = random.randint(0,inputs[0].size(1)-self.size[0])
--> 457         w_idx = random.randint(0,inputs[1].size(2)-self.size[1])
    458         outputs = []
    459         for idx, _input in enumerate(inputs):

IndexError: tuple index out of range

A similar error is thrown in the example of the complete pipeline.

  2. Cell 18 should import the new API classes:
# from torchsample.transforms import RandomAdjustGamma, AdjustGamma  
# =>
from torchsample.transforms import RandomGamma, Gamma

Similar stuff with AdjustBrightness and AdjustSaturation in the next cells.

I did not test the 3D brain dataset.

HTH

RandomCrop - bug?

First, I would like to thank you for this excellent library!

Should it be 0 instead of 1 in w_idx?

class RandomCrop(object):

    def __init__(self, size):
        """
        Randomly crop a torch tensor
        Arguments
        --------
        size : tuple or list
            dimensions of the crop
        """
        self.size = size

    def __call__(self, *inputs):
        h_idx = random.randint(0,inputs[0].size(1)-self.size[0])
        w_idx = random.randint(0,inputs[1].size(2)-self.size[1])  # <-- the suspect index
        outputs = []
        for idx, _input in enumerate(inputs):
            _input = _input[:, h_idx:(h_idx+self.size[0]),w_idx:(w_idx+self.size[1])]
            outputs.append(_input)
        return outputs if idx > 1 else outputs[0]

Correct coordinate shift for centered Rotate operations

There is an error in utils.py function th_affine2d. This affects all Rotate functions (and maybe more).
It can be corrected by subtracting 0.5 instead of adding 0.5 when centering the coordinates.

Correct:

if center:
    # shift the coordinates so the center is the origin
    coords[:,:,0] = coords[:,:,0] - (x.size(1) / 2. - 0.5)
    coords[:,:,1] = coords[:,:,1] - (x.size(2) / 2. - 0.5)

# apply the coordinate transformation
new_coords = coords.bmm(A_batch.transpose(1,2)) + b_batch.expand_as(coords)

if center:
    # shift the coordinates back so the origin is the origin
    new_coords[:,:,0] = new_coords[:,:,0] + (x.size(1) / 2. - 0.5)
    new_coords[:,:,1] = new_coords[:,:,1] + (x.size(2) / 2. - 0.5)

Integrate torchsample with DataLoader

Hi,
This project fills the gap for data augmentation and sampling; however, I am a bit confused about how to integrate it with DataLoader. I really liked the fact that multiple workers were taking care of building the batches. If I understand correctly, your datasets are complete substitutes for a DataLoader, right?
Thanks

RandomAffine always applies same value

It seems that the RandomAffine function is calling the normal, non-random affine transformation functions instead of the random versions, so the same values are always used.

Error when install torchsample

Hi, when I want to install torchsample with "python setup.py install", I get a SyntaxError as below:
File "build/bdist.linux-x86_64/egg/torchsample/datasets.py", line 106
inputs = np.empty((len(load_range), *_parse_shape(input_sample)))
^
SyntaxError: invalid syntax

I am not familiar with Python; I guess my Python version does not support this syntax, or there is some other problem under the covers...
Maybe it is a naive problem, but please give me a helping hand.

GPU memory leak

Hi,

I was using this repo for training my networks in pytorch and it has been very helpful and made the code concise. But I am facing out-of-memory issues when training for a large number of iterations or epochs on GPUs. When training starts, the usage is around half; as the iterations proceed, memory usage becomes full. Please help.

Thanks

SubsetRandomSampler causes exception on training

If using a loader with a SubsetRandomSampler that samples a subset of the dataset's elements, the samples will be consumed before an epoch completes and an exception will be raised.

I.e., in fit_loader

len_inputs = len(loader.dataset)

doesn't account for the fact that a sampler may change the number of elements drawn from a dataset.
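
One possible fix, assuming the loader exposes its sampler, is to prefer the sampler's length when one is set:

# a sampler defines how many samples an epoch actually draws
sampler = getattr(loader, 'sampler', None)
len_inputs = len(sampler) if sampler is not None else len(loader.dataset)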

How to add rotation augmentation option to data transforms?

from torchvision import  transforms
import torchsample

data_transform = transforms.Compose([
          transforms.Scale(256),
          torchsample.transforms.Rotate(30),
          transforms.CenterCrop(224),
          transforms.ToTensor(),
          transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
      ])

Is this right? Where is the proper position and proper way to use it?
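
One likely fix (an untested sketch): torchsample's transforms operate on torch tensors, so Rotate should come after ToTensor rather than between the PIL-based torchvision transforms:

from torchvision import transforms
import torchsample

data_transform = transforms.Compose([
    transforms.Scale(256),                 # PIL image in, PIL image out
    transforms.CenterCrop(224),
    transforms.ToTensor(),                 # convert to a torch tensor first
    torchsample.transforms.Rotate(30),     # tensor-based transform
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])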

should set training mode explicitly

In module_trainer.py, it is better to set training mode as True in fit and fit_loader, and False in predict and predict_loader.

In current code, when I set a break point in predict_loader(), I found the self.model.training == True. This can impact the prediction performance and even cause memory leak when I use my own noise module (which should only run in training mode).

So I suggest adding self.model.train(mode=True) as the first line of those "fit" functions, and self.model.train(mode=False) as the first line of those "predict" functions.
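
A minimal sketch of the suggestion (not the actual module_trainer code):

class TrainerSketch(object):
    def __init__(self, model):
        self.model = model

    def fit_loader(self, loader):
        self.model.train(mode=True)   # enable dropout / batchnorm updates
        # ... training loop ...

    def predict_loader(self, loader):
        self.model.train(mode=False)  # equivalent to self.model.eval()
        # ... inference loop ...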

AttributeError: 'CNN' object has no attribute '_has_regularizers'

Hi! Looks like a great library, but I'm getting the following error on Python 3.6 with the newest versions of pytorch and this repo:

  File "/home/eelco/PycharmProjects/pt_mnist/train_ps.py", line 51, in <module>
    train(20)
  File "/home/eelco/PycharmProjects/pt_mnist/train_ps.py", line 48, in train
    nb_epoch=n_epochs)
  File "/home/eelco/anaconda3/lib/python3.6/site-packages/torchsample/modules/module_trainer.py", line 444, in fit_loader
    callbacks.on_train_begin()
  File "/home/eelco/anaconda3/lib/python3.6/site-packages/torchsample/callbacks.py", line 67, in on_train_begin
    callback.on_train_begin(logs)
  File "/home/eelco/anaconda3/lib/python3.6/site-packages/torchsample/callbacks.py", line 170, in on_train_begin
    if self.model._has_regularizers:
  File "/home/eelco/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 238, in __getattr__
    type(self).__name__, name))
AttributeError: 'CNN' object has no attribute '_has_regularizers'

My code, where train_loader and test_loader are the same as in the pytorch mnist examples:

    model = CNN((1, 28, 28))
    trainer = ModuleTrainer(model)

    trainer.compile(
        loss='nll_loss',
        optimizer='adam',
    )

    trainer.fit_loader(train_loader,
                       val_loader=test_loader,
                       nb_epoch=10)

Seems the History callback is looking for attributes in the Module that should be in ModuleTrainer?

Edit:
I saw Issue #7 reported the same bug, but when removing the History callback, training gives errors (AttributeError: 'CNN' object has no attribute 'history')

import *

This is not good practice. Please explicitly state the modules being imported.

from torchsample.callbacks import *
from torchsample.regularizers import *
from torchsample.constraints import *
from torchsample.initializers import *
from torchsample.metrics import *

Support for Multiple Channels in th_affine2d

There seems to be a bug in the affine transform

>>> from torchsample.utils import *
>>> x = torch.zeros(2,1000,1000)
>>> x[:,100:1500,100:500] = 10
>>> matrix = torch.FloatTensor([[1.,0,-50],[0,1.,-50]])
>>> xb = th_affine2d(x, matrix, mode='bilinear')

Throws
RuntimeError: size '[2 x 1000 x 1000]' is invalid for input of with 1000000 elements at /Users/soumith/miniconda2/conda-bld/pytorch_1490983457972/work/torch/lib/TH/THStorage.c:59
It's attempting to view the tensor with the wrong dimensions in both th_nearest_interp2d and th_bilinear_interp2d

`FolderDataset` does not work with `pil_loader`

FolderDataset converts its input to a Torch tensor with torch.from_numpy, which does not work with PIL.Image and causes RuntimeError: from_numpy expects an np.ndarray but got Image when file_loader is 'pil'.

IMHO, this implicit conversion also makes the transform pipeline less efficient in some cases. For example, in the following code snippet, if FolderDataset were to convert PIL.Image to a Torch tensor, the user would have to convert it back just to use some image transforms.

import torchsample.transforms as transforms
from torchvision.transforms import Scale, ToTensor, ToPILImage

transforms.Compose([
    ToPILImage(), # XXX this could be avoided
    Scale(256),
    ToTensor(),
    transforms.RandomRotate(20),
    transforms.RandomFlip(),
    transforms.RandomCrop((224, 224)),
])

AttributeError: 'CIFAR10' object has no attribute 'num_inputs'

I get this error when I load CIFAR10 (as described in the pytorch 60-minute tutorial) and try to fit a simple ConvNet with trainer.fit_loader(). It looks like fit_loader accesses some attribute that is not, or is no longer, part of the dataset object (at least for CIFAR10).

This used to work, but it stopped working after I updated both pytorch (version 0.2.0.post3) and torchsample (version 0.1.3) recently.

Here is a simple(ish) example:
https://gist.github.com/pbloem/d370634327cccccf4c56bb6bb7d411f5

epoch and batch size agnostic sampler

Would it be possible to create a sampler that's agnostic to batch size and epoch? Something that seamlessly returns the requested batch size regardless of position. It would help avoid annoying loops by offloading the problem to the sampler, and would open up the possibility of injecting noise through a random batch size for both the gradient and BN.

epoch, x, y = sampler.next_batch(bsize=128)

Ideally it would maintain a sample buffer of configurable size from which it samples and would shuffle in the background so that at the end of an epoch there isn't a noticeable pause.

I have always wanted to write a sampler like that myself but never had the time; however, you said you are taking requests.

A more advanced feature could be ensuring that every class is presented an equal number of times within an epoch or even a batch, subject to more transformations to make up for over-representation perhaps.
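
A rough sketch of the requested interface (hypothetical, not part of torchsample): a sampler that keeps a shuffled index buffer, refills it whenever it runs dry, and reports the epoch it is on:

import random

class BufferedSampler(object):
    def __init__(self, dataset):
        self.dataset = dataset
        self.epoch = 0
        self._buffer = []

    def _refill(self):
        # reshuffle all indices; in a fancier version this would happen
        # in the background to avoid the end-of-epoch pause
        self._buffer = list(range(len(self.dataset)))
        random.shuffle(self._buffer)
        self.epoch += 1

    def next_batch(self, bsize=128):
        indices = []
        while len(indices) < bsize:
            if not self._buffer:
                self._refill()
            indices.append(self._buffer.pop())
        x, y = zip(*[self.dataset[i] for i in indices])
        return self.epoch, x, y

# epoch, x, y = BufferedSampler(dataset).next_batch(bsize=128)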

[feature request] Random rotate among discrete angles.

For example, I want to randomly rotate the image counterclockwise, i.e. by one of [0, 90, 180, 270] degrees. Can this be implemented in this package?
I've quickly implemented one:

import random
from torchsample.transforms import Rotate

class RandomDiscreteRotate(object):

    def __init__(self, 
                 rotation_range,
                 interp='bilinear',
                 lazy=False):
        """
        Randomly rotate an image between degrees in the given list. If the image
        has multiple channels, the same rotation will be applied to each channel.
        Arguments
        ---------
        rotation_range : list
            image will be rotated by one of the degrees given in the list
        interp : string in {'bilinear', 'nearest'} or list of strings
            type of interpolation to use
        lazy    : boolean
            if false, perform the transform on the tensor and return the tensor
            if true, only create the affine transform matrix and return that
        """
        self.rotation_range = rotation_range
        if not isinstance(interp, (tuple,list)):
            interp = (interp, interp)
        self.interp = interp
        self.lazy = lazy

    def __call__(self, x, y=None):
        k = random.randint(0, len(self.rotation_range)-1)
        degree = self.rotation_range[k]
        
        if self.lazy:
            return Rotate(degree, lazy=True)(x)
        else:
            if y is None:
                x_transformed = Rotate(degree,
                                       interp=self.interp)(x)
                return x_transformed
            else:
                x_transformed, y_transformed = Rotate(degree,
                                                      interp=self.interp)(x,y)
                return x_transformed, y_transformed

Multi-GPU support

Hi,

From reading the code, it seems that the ModuleTrainer only supports single-GPU processing. It would be extremely useful to be able to support multiple GPUs, since a single GPU is not sufficient for anything but toy datasets.

Thanks
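
A common workaround (untested with this trainer) is to wrap the model in nn.DataParallel before handing it to ModuleTrainer, so forward passes are split across the available GPUs:

import torch.nn as nn
from torchsample.modules import ModuleTrainer

model = nn.DataParallel(model)   # 'model' is your existing nn.Module
trainer = ModuleTrainer(model)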

Metrics only print for training, not validation

Hi,

Love this API and the fact that it's so similar to Keras's! One minor issue I've seen -- metrics are only evaluated for training, and not validation, so you can't see validation top-k accuracy and other metrics. Is this a feature that will be added soon?

Error when using `RandomCrop`

throws IndexError: tuple index out of range, traceback:

IndexError: Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 41, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/usr/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 41, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/usr/lib/python3.6/site-packages/torchvision/datasets/folder.py", line 84, in __getitem__
    img = self.transform(img)
  File "/usr/lib/python3.6/site-packages/torchsample/transforms/tensor_transforms.py", line 31, in __call__
    inputs = transform(*inputs)
  File "/usr/lib/python3.6/site-packages/torchsample/transforms/tensor_transforms.py", line 438, in __call__
    w_idx = random.randint(0,inputs[1].size(2)-self.size[1])
IndexError: tuple index out of range

Data augmentation doesn't seem to work as expected

Thanks for the library!
I am having issues with the data augmentation API. My goal is to use a random affine transformation function to transform both an image and a mask the same way on every call.

In the code below, I am calling an affine transformer function on an image and a mask together, but the output is just the transformed image. Can you point out the correct way of doing this? Thanks!

from torchsample.transforms import RandomAffine

# Create transformer func
tform = RandomAffine(rotation_range=30, translation_range=0.2, zoom_range=(0.8,1.2))

# Create random image and mask
img = torch.randn(3,100, 100)
mask = torch.randn(1,100,100) 

# The output shows a tensor size of 3x100x100, 
# but I expected a list of tensors belonging to the transformed  'img' and  'mask'
print(tform(img, mask).size())

How to specify loss function?

The trainer module accepts a string for a loss function. Can I pass a function instead?
Otherwise, where can I find the mapping between these strings and actual pytorch loss functions?
For example, how do I use the CrossEntropyLoss?
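
If compile() accepts callables as well as strings (the Keras-style API suggests it may), any loss with the signature loss(output, target) can be passed directly, for example:

import torch.nn as nn
from torchsample.modules import ModuleTrainer

trainer = ModuleTrainer(model)   # 'model' is your nn.Module
trainer.compile(loss=nn.CrossEntropyLoss(), optimizer='adam')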

module_trainer: Make all non-training Variables volatile

Hi, I have been using Pytorch and your great framework for two days, and I observed high VRAM usage while evaluating my models. I think this is caused by not using volatile variables in module_trainer.py for tasks that are not related to training.
When I change Variable(...) to Variable(..., volatile=True), memory consumption is what I'd expect.

Print loss values separately when using multiple losses + add support for loss_weights [Feature Request]

There is currently no way to visualize or tune losses separately when using a model that has two completely different outputs, each with a different objective function.

For instance, the model below would mix all losses together into a single value:

class MultiInputOutputModel(nn.Module):
    def forward(self, x):
        return f(x), g(x)

trainer = ModuleTrainer(model)
# This throw an exception: TypeError: 'CategoricalAccuracy' object does not support indexing
# trainer.compile(optimizer='adam', loss=[criterion, mse_loss])
trainer.fit(x_train, (fx_train, gx_train), val_data=(x_val, (fx_val, gx_val)), num_epoch=1)
Epoch 1/1: 104 batches [00:05, 17.93 batches/s, val_loss=1.38e+03, loss=961]  

In the model above, f(x) and g(x) have completely different scales: f(x) is an image classifier and g(x) is a regression model that outputs bounding boxes. It would be extremely useful to be able to see every loss & metric separately (like Keras does):

model.compile(Adam(lr=0.001), loss=['mse', 'categorical_crossentropy'], metrics=['accuracy'],
             loss_weights=[.001, 1.])
model.fit(x_train, [fx_train, gx_train], validation_data=(x_val, [fx_val, gx_val]), nb_epoch=1)

3277/3277 [==============================] - 2s - loss: 6.1604 - gx_loss: 5030.2780 -
 fx_loss: 1.1302 - fx_acc: 0.4007 - gx_acc: 0.6710 - val_loss: 4.8844 - val_fx_loss: 4078.7171 -
 val_gx_loss: 0.8057 - val_gx_acc: 0.4500 - val_fx_acc: 0.8400

It would also be super useful to be able to tune each loss separately (losses get combined by a weighted sum):

trainer.compile(optimizer='adam', loss=[criterion, mse_loss], metrics=CategoricalAccuracy(),
 loss_weights=[1., 0.001])

Would anyone else be interested in this feature?

Edit: I have a proof of concept here: https://github.com/ncullen93/torchsample/compare/master...recastrodiaz:losses?expand=1 Still many rough edges, but happy to get feedback on it.
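
Until something like this lands, one hypothetical stopgap for the weighting half of the request (it does not solve the separate per-loss reporting) is a wrapper that collapses the per-output losses into one weighted sum:

import torch.nn as nn

class WeightedMultiLoss(nn.Module):
    def __init__(self, losses, weights):
        super(WeightedMultiLoss, self).__init__()
        self.losses = losses
        self.weights = weights

    def forward(self, outputs, targets):
        # weighted sum of each loss applied to its own output/target pair
        return sum(w * loss(o, t)
                   for loss, w, o, t
                   in zip(self.losses, self.weights, outputs, targets))

# criterion = WeightedMultiLoss([nn.MSELoss(), nn.CrossEntropyLoss()],
#                               [0.001, 1.0])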
