interdigitalinc / compressai

A PyTorch library and evaluation platform for end-to-end compression research

Home Page: https://interdigitalinc.github.io/CompressAI/

License: BSD 3-Clause Clear License

Python 89.22% C++ 9.56% Makefile 0.23% Dockerfile 0.07% CMake 0.24% Shell 0.69%

compression deep-learning python pytorch machine-learning deep-neural-networks neural-network

compressai's Issues

about bpp

Sorry to bother you again. I ran some experiments on my own task and found that if I constrain the values of y and y_hat to [-1, 1], the bpp becomes very small, for example 0.0037. Isn't it strange for the bpp to be that small?
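For reference, the rate term used by the example training script on this page is the negative log2-likelihood summed over all latent elements and divided by the number of image pixels. The following is a minimal sketch in plain Python; the function name `bits_per_pixel` and the toy numbers are illustrative assumptions, not library code. It shows that likelihoods close to 1 lead to a very small bpp estimate:

```python
import math


def bits_per_pixel(likelihoods, num_pixels):
    """Estimated rate: sum of -log2(p) over latent elements, per image pixel."""
    total_bits = sum(-math.log2(p) for p in likelihoods)
    return total_bits / num_pixels


# Toy case (assumed numbers): a 256x256 image whose 16x16x192 latent elements
# are each assigned probability 0.999 by the entropy model.
num_pixels = 256 * 256
likelihoods = [0.999] * (16 * 16 * 192)
print(f"{bits_per_pixel(likelihoods, num_pixels):.4f} bpp")
```

If the latents are clamped to a narrow range, most elements may quantize to the same few symbols, so a very low estimate is plausible. Whether the reconstruction quality is still acceptable at that rate is a separate question.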

Is there any requirement about the resolution of input image?

When I tried to compress a 400x500 image with the pre-trained bmshj2018_hyperprior model, I got the following error:
torch.Size([1, 3, 500, 400])

Traceback (most recent call last):
File "compress_scale_img_val.py", line 139, in <module>
out_net = net.forward(x)
File "/home/super/CompressAI/compressai/models/priors.py", line 263, in forward
y_hat, y_likelihoods = self.gaussian_conditional(y, scales_hat)
File "/home/super/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/super/CompressAI/compressai/entropy_models/entropy_models.py", line 576, in forward
likelihood = self._likelihood(outputs, scales, means)
File "/home/super/CompressAI/compressai/entropy_models/entropy_models.py", line 565, in _likelihood
upper = self._standardized_cumulative((half - values) / scales)
RuntimeError: The size of tensor a (25) must match the size of tensor b (28) at non-singleton dimension 3
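The 25 vs. 28 mismatch in this traceback is consistent with the input width not being a multiple of 64. The sketch below reproduces the arithmetic in plain Python under an assumed layout: four stride-2 convolutions in the analysis transform, two more in the hyper-analysis transform, and two stride-2 upsamplings in the hyper-synthesis transform. The helper names are hypothetical and the layer layout is an assumption about the scale hyperprior architecture, not code taken from the library.

```python
def conv_down(n):
    """Output length of a stride-2 convolution with kernel 5 and padding 2."""
    return (n + 1) // 2


def latent_sizes(width):
    """Return (width of y, width of scales_hat) for the assumed layout."""
    y = width
    for _ in range(4):  # analysis transform: four stride-2 convolutions
        y = conv_down(y)
    z = y
    for _ in range(2):  # hyper-analysis: two stride-2 convolutions
        z = conv_down(z)
    scales = z * 4  # hyper-synthesis: two stride-2 upsamplings
    return y, scales


def pad_to_multiple(n, multiple=64):
    """Smallest size >= n that is divisible by `multiple`."""
    return -(-n // multiple) * multiple


print(latent_sizes(400))                             # (25, 28)
print(pad_to_multiple(400), pad_to_multiple(500))    # 448 512
print(latent_sizes(pad_to_multiple(400)))            # (28, 28)
```

Under these assumptions, padding the image to 448x512 before the forward pass (for example with `torch.nn.functional.pad`) and cropping the reconstruction back to 400x500 afterwards would avoid the shape mismatch.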

Unable to evaluate my own model

Thank you for your impressive work. I have a question about training my own model. I simply ran the example training script:
python3 examples/train.py -d ./DIV2K/ --epochs 5 -lr 1e-4 --batch-size 16 --cuda --save

Then I evaluated the result by running:
python -m compressai.utils.eval_model checkpoint ./DIV2K/test/ -a bmshj2018-factorized -p checkpoint_best_loss.pth.tar

It returns the following error:
Traceback (most recent call last):
  File "/home/jifan/anaconda3/envs/CAI/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/jifan/anaconda3/envs/CAI/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/jifan/CAI/CompressAI/compressai/utils/eval_model/__main__.py", line 309, in <module>
    main(sys.argv[1:])
  File "/home/jifan/CAI/CompressAI/compressai/utils/eval_model/__main__.py", line 285, in main
    model = load_func(*opts, run)
  File "/home/jifan/CAI/CompressAI/compressai/utils/eval_model/__main__.py", line 149, in load_checkpoint
    return architectures[arch].from_state_dict(state_dict).eval()
  File "/home/jifan/CAI/CompressAI/compressai/models/priors.py", line 161, in from_state_dict
    N = state_dict["g_a.0.weight"].size(0)
KeyError: 'g_a.0.weight'

Python: 3.6.13
CompressAI: 1.1.5.dev0
Thanks!
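This KeyError is consistent with `from_state_dict` receiving the whole training checkpoint rather than the model weights: the training script shown on this page saves a dict with the keys "epoch", "state_dict", "loss", "optimizer", "aux_optimizer" and "lr_scheduler". A minimal sketch of unwrapping such a checkpoint follows. The helper `extract_state_dict` is hypothetical, plain strings stand in for tensors, and the "module." prefix handling assumes the weights may have been saved from a DataParallel wrapper.

```python
def extract_state_dict(checkpoint):
    """Return model weights from either a raw state_dict or a training checkpoint."""
    # Training checkpoints nest the weights under "state_dict",
    # next to "epoch", "optimizer", and so on.
    state_dict = checkpoint.get("state_dict", checkpoint)
    # Assumption: weights saved from a DataParallel wrapper carry a "module." prefix.
    prefix = "module."
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Stand-in objects instead of real tensors, to keep the sketch self-contained.
checkpoint = {"epoch": 4, "state_dict": {"module.g_a.0.weight": "W"}, "loss": 1.0}
weights = extract_state_dict(checkpoint)
print("g_a.0.weight" in weights)  # True
```

In practice `checkpoint` would come from `torch.load(path, map_location="cpu")`, and the returned dict is what a `from_state_dict`-style constructor expects.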

Error in loading a stored checkpoint

Hello,

When I load a stored checkpoint, I get the following error:

RuntimeError: output with shape [128, 3, 1] doesn't match the broadcast shape [128, 3, 3]

If I am reading the state_dict correctly, I think there is probably a bug in your load_state_dict. For your convenience, I slightly modified your CompressAI/examples/train.py example so that it also accepts a checkpoint as input and resumes from a previously stored checkpoint. To reproduce the error, run the following code twice (I used the bmshj2018-hyperprior model):

Once without --checkpoint-file for 1-2 epochs just to save a checkpoint
Then run the script with --checkpoint-file [/address/of/stored/checkpoint].

# Copyright 2020 InterDigital Communications, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import argparse
import math
import random
import shutil
import sys

import torch
import torch.nn as nn
import torch.optim as optim

from torch.utils.data import DataLoader
from torchvision import transforms

from compressai.datasets import ImageFolder
from compressai.zoo import models


class RateDistortionLoss(nn.Module):
    """Custom rate distortion loss with a Lagrangian parameter."""

    def __init__(self, lmbda=1e-2):
        super().__init__()
        self.mse = nn.MSELoss()
        self.lmbda = lmbda

    def forward(self, output, target):
        N, _, H, W = target.size()
        out = {}
        num_pixels = N * H * W

        out["bpp_loss"] = sum(
            (torch.log(likelihoods).sum() / (-math.log(2) * num_pixels))
            for likelihoods in output["likelihoods"].values()
        )
        out["mse_loss"] = self.mse(output["x_hat"], target)
        out["loss"] = self.lmbda * 255 ** 2 * out["mse_loss"] + out["bpp_loss"]

        return out


class AverageMeter:
    """Compute running average."""

    def __init__(self):
        self.val = 0
        self.avg = 0
        self.sum = 0
        self.count = 0

    def update(self, val, n=1):
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count


class CustomDataParallel(nn.DataParallel):
    """Custom DataParallel to access the module methods."""

    def __getattr__(self, key):
        try:
            return super().__getattr__(key)
        except AttributeError:
            return getattr(self.module, key)


def configure_optimizers(net, args):
    """Separate parameters for the main optimizer and the auxiliary optimizer.
    Return two optimizers"""

    parameters = set(
        p for n, p in net.named_parameters() if not n.endswith(".quantiles")
    )
    aux_parameters = set(
        p for n, p in net.named_parameters() if n.endswith(".quantiles")
    )

    # Make sure we don't have an intersection of parameters
    params_dict = dict(net.named_parameters())
    inter_params = parameters & aux_parameters
    union_params = parameters | aux_parameters

    assert len(inter_params) == 0
    assert len(union_params) - len(params_dict.keys()) == 0

    optimizer = optim.Adam(
        (p for p in parameters if p.requires_grad),
        lr=args.learning_rate,
    )
    aux_optimizer = optim.Adam(
        (p for p in aux_parameters if p.requires_grad),
        lr=args.aux_learning_rate,
    )
    return optimizer, aux_optimizer


def train_one_epoch(
    model, criterion, train_dataloader, optimizer, aux_optimizer, epoch, clip_max_norm
):
    model.train()
    device = next(model.parameters()).device

    for i, d in enumerate(train_dataloader):
        d = d.to(device)

        optimizer.zero_grad()
        aux_optimizer.zero_grad()

        out_net = model(d)

        out_criterion = criterion(out_net, d)
        out_criterion["loss"].backward()
        if clip_max_norm > 0:
            torch.nn.utils.clip_grad_norm_(model.parameters(), clip_max_norm)
        optimizer.step()

        aux_loss = model.aux_loss()
        aux_loss.backward()
        aux_optimizer.step()

        if i % 10 == 0:
            print(
                f"Train epoch {epoch}: ["
                f"{i*len(d)}/{len(train_dataloader.dataset)}"
                f" ({100. * i / len(train_dataloader):.0f}%)]"
                f'\tLoss: {out_criterion["loss"].item():.3f} |'
                f'\tMSE loss: {out_criterion["mse_loss"].item():.3f} |'
                f'\tBpp loss: {out_criterion["bpp_loss"].item():.2f} |'
                f"\tAux loss: {aux_loss.item():.2f}"
            )


def test_epoch(epoch, test_dataloader, model, criterion):
    model.eval()
    device = next(model.parameters()).device

    loss = AverageMeter()
    bpp_loss = AverageMeter()
    mse_loss = AverageMeter()
    aux_loss = AverageMeter()

    with torch.no_grad():
        for d in test_dataloader:
            d = d.to(device)
            out_net = model(d)
            out_criterion = criterion(out_net, d)

            aux_loss.update(model.aux_loss())
            bpp_loss.update(out_criterion["bpp_loss"])
            loss.update(out_criterion["loss"])
            mse_loss.update(out_criterion["mse_loss"])

    print(
        f"Test epoch {epoch}: Average losses:"
        f"\tLoss: {loss.avg:.3f} |"
        f"\tMSE loss: {mse_loss.avg:.3f} |"
        f"\tBpp loss: {bpp_loss.avg:.2f} |"
        f"\tAux loss: {aux_loss.avg:.2f}\n"
    )

    return loss.avg


def save_checkpoint(state, is_best, filename="checkpoint.pth.tar"):
    torch.save(state, filename)
    if is_best:
        shutil.copyfile(filename, "checkpoint_best_loss.pth.tar")


def parse_args(argv):
    parser = argparse.ArgumentParser(description="Example training script.")
    parser.add_argument(
        "-m",
        "--model",
        default="bmshj2018-factorized",
        choices=models.keys(),
        help="Model architecture (default: %(default)s)",
    )
    parser.add_argument(
        "-d", "--dataset", type=str, required=True, help="Training dataset"
    )
    parser.add_argument(
        "-e",
        "--epochs",
        default=100,
        type=int,
        help="Number of epochs (default: %(default)s)",
    )
    parser.add_argument(
        "-lr",
        "--learning-rate",
        default=1e-4,
        type=float,
        help="Learning rate (default: %(default)s)",
    )
    parser.add_argument(
        "-n",
        "--num-workers",
        type=int,
        default=30,
        help="Dataloaders threads (default: %(default)s)",
    )
    parser.add_argument(
        "--lambda",
        dest="lmbda",
        type=float,
        default=1e-2,
        help="Bit-rate distortion parameter (default: %(default)s)",
    )
    parser.add_argument(
        "--batch-size", type=int, default=16, help="Batch size (default: %(default)s)"
    )
    parser.add_argument(
        "--test-batch-size",
        type=int,
        default=64,
        help="Test batch size (default: %(default)s)",
    )
    parser.add_argument(
        "--aux-learning-rate",
        default=1e-3,
        type=float,  # without this, a value passed on the command line stays a string
        help="Auxiliary loss learning rate (default: %(default)s)",
    )
    parser.add_argument(
        "--patch-size",
        type=int,
        nargs=2,
        default=(256, 256),
        help="Size of the patches to be cropped (default: %(default)s)",
    )
    parser.add_argument("--cuda", action="store_true", help="Use cuda")
    parser.add_argument("--save", action="store_true", help="Save model to disk")
    parser.add_argument(
        "--seed", type=float, help="Set random seed for reproducibility"
    )
    parser.add_argument(
        "--clip_max_norm",
        default=1.0,
        type=float,
        help="gradient clipping max norm (default: %(default)s)",
    )
    parser.add_argument('--checkpoint-file', type=str, help='File address to resume training from the previous saved checkpoint')
    args = parser.parse_args(argv)
    return args


def main(argv):
    args = parse_args(argv)

    if args.seed is not None:
        torch.manual_seed(args.seed)
        random.seed(args.seed)

    train_transforms = transforms.Compose(
        [transforms.RandomCrop(args.patch_size), transforms.ToTensor()]
    )

    test_transforms = transforms.Compose(
        [transforms.CenterCrop(args.patch_size), transforms.ToTensor()]
    )

    train_dataset = ImageFolder(args.dataset, split="train", transform=train_transforms)
    test_dataset = ImageFolder(args.dataset, split="test", transform=test_transforms)

    train_dataloader = DataLoader(
        train_dataset,
        batch_size=args.batch_size,
        num_workers=args.num_workers,
        shuffle=True,
        pin_memory=True,
    )

    test_dataloader = DataLoader(
        test_dataset,
        batch_size=args.test_batch_size,
        num_workers=args.num_workers,
        shuffle=False,
        pin_memory=True,
    )

    device = "cuda" if args.cuda and torch.cuda.is_available() else "cpu"

    net = models[args.model](quality=3)
    net = net.to(device)

    if args.cuda and torch.cuda.device_count() > 1:
        net = CustomDataParallel(net)

    optimizer, aux_optimizer = configure_optimizers(net, args)
    lr_scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, "min")
    criterion = RateDistortionLoss(lmbda=args.lmbda)

    last_epoch = -1
    if args.checkpoint_file:  # load from previous checkpoint
        print("Loading", args.checkpoint_file)
        checkpoint = torch.load(args.checkpoint_file, map_location=device)
        last_epoch = checkpoint["epoch"]
        net.load_state_dict((checkpoint["state_dict"]))
        net.update(force=True)  # update the model CDFs parameters.
        optimizer.load_state_dict((checkpoint["optimizer"]))
        aux_optimizer.load_state_dict((checkpoint["aux_optimizer"]))
        lr_scheduler.load_state_dict((checkpoint["lr_scheduler"]))

    best_loss = 1e10
    for epoch in range(last_epoch + 1, args.epochs):
        print(f"Learning rate: {optimizer.param_groups[0]['lr']}")
        train_one_epoch(
            net,
            criterion,
            train_dataloader,
            optimizer,
            aux_optimizer,
            epoch,
            args.clip_max_norm,
        )

        loss = test_epoch(epoch, test_dataloader, net, criterion)
        lr_scheduler.step(loss)

        is_best = loss < best_loss
        best_loss = min(loss, best_loss)
        if args.save:
            save_checkpoint(
                {
                    "epoch": epoch,
                    "state_dict": net.state_dict(),
                    "loss": loss,
                    "optimizer": optimizer.state_dict(),
                    "aux_optimizer": aux_optimizer.state_dict(),
                    "lr_scheduler": lr_scheduler.state_dict(),
                },
                is_best,
            )



if __name__ == "__main__":
    main(sys.argv[1:])

EntropyBottleneck parameters not optimized as they should

Hi,
The main optimizer gets its parameters through net.parameters(), which is overridden by a function defined in priors.py in the CompressionModel class. This function returns all of the model's parameters except those that belong to the EntropyBottleneck class. The latter, returned by net.aux_parameters(), are optimized by the auxiliary optimizer:
optimizer = optim.Adam(net.parameters(), lr=args.learning_rate)
aux_optimizer = optim.Adam(net.aux_parameters(), lr=args.aux_learning_rate)
The problem is that EntropyBottleneck also contains the important parameters _biases, _factors and _matrices, which are responsible for estimating the pdf and should be optimized during training by the main optimizer with the main loss. In addition, EntropyBottleneck contains the auxiliary parameter 'quantiles', which is relevant only for the entropy coding itself, done later. The quantiles should indeed be optimized by the auxiliary optimizer with the auxiliary loss.
In the current implementation, if I understand your code correctly, the _biases, _factors and _matrices parameters fall through the cracks and are in fact not optimized by either optimizer, because the aux loss is irrelevant for them. Given this observation, it is not clear to me how you managed to get those nice training results.

Thanks,
Danny
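One way to express the split described above is to select parameters by name, so that only names ending in ".quantiles" go to the auxiliary optimizer and everything else, including the density-model parameters, goes to the main optimizer. This is also the approach taken by configure_optimizers in the modified train.py shown on this page. The sketch below works on names only; the parameter names in it are hypothetical examples, and the helper `split_parameter_names` is not part of the library.

```python
def split_parameter_names(names):
    """Split parameter names into (main, aux) groups by the '.quantiles' suffix."""
    main = sorted(n for n in names if not n.endswith(".quantiles"))
    aux = sorted(n for n in names if n.endswith(".quantiles"))
    # The two groups must be disjoint and together cover every parameter.
    assert not set(main) & set(aux)
    assert len(main) + len(aux) == len(set(names))
    return main, aux


# Hypothetical parameter names, modelled on an EntropyBottleneck inside a model.
names = [
    "g_a.0.weight",
    "entropy_bottleneck._matrices.0",
    "entropy_bottleneck._biases.0",
    "entropy_bottleneck._factors.0",
    "entropy_bottleneck.quantiles",
]
main, aux = split_parameter_names(names)
print(aux)  # ['entropy_bottleneck.quantiles']
```

With a real model, the names would come from `net.named_parameters()` and each group would be passed to its own `optim.Adam` instance.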

…

KeyError: 'g_a.0.weight' in Model update

Hi all,

I am trying to fine-tune a pre-trained model on my dataset. The package versions I used are:

PyTorch: 1.8.1+cu111
CompressAI: 1.1.4

I copied examples/train.py to finetune.py and changed one line to the following:

net = models[args.model](quality=args.quality, metric="mse", pretrained=True)

I then used the following command to fine-tune the pre-trained model and save a checkpoint:

python finetune.py -m cheng2020-anchor -q 1 -d ../../test-pictures-finetuned --save --cuda

This successfully generated the checkpoints I need. I then followed the steps in the tutorial to update the CDFs and export the checkpoint, using the following command:

python -m compressai.utils.update_model --architecture mean-scale-hyperprior -n cheng2020-anchor-finetuned -d ../updated-checkpoints checkpoint_best_loss.pth.tar

The following error occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.8/dist-packages/compressai/utils/update_model/__main__.py", line 139, in <module>
    main(sys.argv[1:])
  File "/usr/local/lib/python3.8/dist-packages/compressai/utils/update_model/__main__.py", line 110, in main
    net = model_cls.from_state_dict(state_dict)
  File "/usr/local/lib/python3.8/dist-packages/compressai/models/priors.py", line 274, in from_state_dict
    N = state_dict["g_a.0.weight"].size(0)
KeyError: 'g_a.0.weight'

I also tried every other --architecture value ({factorized-prior,jarhp,mean-scale-hyperprior,scale-hyperprior}), but none of them works. Could you help me with this?
Thanks a lot!

Evaluation error on my own dataset

Hi, I trained a model on my own dataset and got the following error when evaluating it. What is wrong?

julie@loaclhost compressai % python -m compressai.utils.eval_model checkpoint /Users/julie/compressai/data/test/ -a bmshj2018-factorized -p checkpoint_best_loss.pth.tar
Traceback (most recent call last):
  File "/Users/julie/opt/anaconda3/envs/torch/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/julie/opt/anaconda3/envs/torch/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/julie/compressai/compressai/utils/eval_model/__main__.py", line 286, in <module>
    main(sys.argv[1:])
  File "/Users/julie/compressai/compressai/utils/eval_model/__main__.py", line 264, in main
    model = load_func(*opts, run)
  File "/Users/julie/compressai/compressai/utils/eval_model/__main__.py", line 145, in load_checkpoint
    return architectures[arch].from_state_dict(torch.load(checkpoint_path)).eval()
  File "/Users/julie/compressai/compressai/models/priors.py", line 184, in from_state_dict
    N = state_dict["g_a.0.weight"].size(0)
KeyError: 'g_a.0.weight'

Visualization results on the CIFAR-10 dataset are very poor

Thanks for your great work! I trained a simple autoencoder on CIFAR-10, and the loss decreased normally during training. After training, I updated the parameters of the entropy model. When I visualize the results without calling model.eval(), the reconstructed image looks normal, but with model.eval() the reconstruction is very bad. Looking forward to your reply.
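One difference between the two modes that may be relevant here: learned-compression entropy models of this kind typically replace rounding with additive uniform noise during training and apply hard rounding in evaluation mode. The sketch below illustrates that difference on a single value. The function `quantize` is a hypothetical stand-in, not the library's implementation.

```python
import random


def quantize(value, training, rng=random):
    """Noise-based relaxation in training mode, hard rounding in evaluation mode."""
    if training:
        # Additive uniform noise of width 1 stands in for rounding.
        return value + rng.uniform(-0.5, 0.5)
    return float(round(value))


latent = 0.4
print(quantize(latent, training=True))   # somewhere between -0.1 and 0.9
print(quantize(latent, training=False))  # 0.0
```

If most latent values of a small model have magnitude below 0.5, they all round to zero in evaluation mode while the noisy training-mode values still carry information, which would produce the kind of gap described above. Whether that is what happens in this case is an assumption that would need to be checked on the actual latents.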

About lambda

Sorry to bother you again. I have another question: what does the value of lambda in the loss function depend on? If my images are 112x112, do I need to adjust the lambda in the config, or do I need to search for a suitable lambda value?
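For context, the loss in the example training script on this page is lambda * 255^2 * MSE + bpp. Both terms are already normalized per pixel (the MSE is a mean and the bpp divides by the pixel count), so the image size does not enter the formula directly; lambda only sets the trade-off between rate and distortion. A minimal sketch, in which the lambda values are arbitrary illustrations rather than recommended settings:

```python
def rd_loss(mse, bpp, lmbda=1e-2):
    """Rate-distortion loss: lambda * 255^2 * MSE + bpp."""
    return lmbda * 255 ** 2 * mse + bpp


# The same (mse, bpp) pair weighted with different trade-offs.
for lmbda in (0.0018, 0.01, 0.05):
    print(lmbda, rd_loss(mse=0.001, bpp=0.5, lmbda=lmbda))
```

A larger lambda penalizes distortion more heavily and so pushes training toward higher quality at a higher bit-rate. The best value for a particular task is usually found empirically.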

GMM based entropy model

Hi, thanks for the nice work!
I have some questions about a GMM-based entropy model in the CompressAI framework.

Do you have any plans to implement a GMM-based entropy model?
If it is not too much trouble, could you give me some hints on implementing it myself?

I have implemented a GMM-based entropy model for training, but not for testing (actual encoding/decoding).
That is, I successfully modified the entropy_parameter module in the Cheng2020 model and _likelihood() of GaussianConditional, but I have no idea how to modify update() of GaussianConditional.
How should I change the update function for actual compression?

Thanks.
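For readers unfamiliar with the training-side part mentioned above, the likelihood of an integer-quantized symbol under a Gaussian mixture is the weighted sum of each component's probability mass on the interval [y - 0.5, y + 0.5]. The sketch below is a plain-Python illustration with made-up mixture parameters; the function names are hypothetical and it does not address how update() should build coding tables.

```python
import math


def std_normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gmm_likelihood(y, weights, means, scales):
    """P(y) for an integer-quantized symbol under a Gaussian mixture."""
    total = 0.0
    for w, mu, s in zip(weights, means, scales):
        upper = std_normal_cdf((y + 0.5 - mu) / s)
        lower = std_normal_cdf((y - 0.5 - mu) / s)
        total += w * (upper - lower)
    return total


# Made-up two-component mixture; the weights sum to 1.
weights, means, scales = [0.6, 0.4], [0.0, 3.0], [1.0, 0.5]
pmf = [gmm_likelihood(k, weights, means, scales) for k in range(-20, 21)]
print(sum(pmf))  # very close to 1
```

A probability table like `pmf`, evaluated over a bounded symbol range, is the kind of input a range coder needs, which suggests one possible direction for actual encoding. How to fit that into GaussianConditional is left open here.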

Learning rate adjustments during train

Hi,
Thanks for your great project!

I am trying to figure out the exact learning rate schedule you used, in order to reproduce the training of bmshj2018, mbt2018, etc. I couldn't find any indication in the code (e.g. in train.py), but the documentation does say:

Models were trained with a batch size of 16 or 32, and an initial learning rate of 1e-4 for approximately 1-2M steps. The learning rate is then divided by 2 when the evaluation loss reaches a plateau (we use a patience of 20 epochs).

From this I infer that you're probably using pytorch's ReduceLROnPlateau with patience 20 and factor 0.5.
My questions are:

Did you apply ReduceLROnPlateau only on the optimizer of the main loss, or also for the auxiliary loss that is being optimized during training as well?
If ReduceLROnPlateau is applied also on the auxiliary loss (that is initialized with LR of 1e-3), then what are its parameters? Are they identical to the main loss?
What about the other parameters for ReduceLROnPlateau (e.g. threshold, cooldown, min_lr, etc.)? If there are other non-default values that you used, could you please specify?

Another question on a slightly different topic: when training for the MS-SSIM metric, did you train the models from scratch, or fine-tune them from pre-trained models optimized for MSE? (Some papers mention using the latter option to save time.)

All the best,
Danny
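The schedule inferred in this question corresponds to PyTorch's `torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min", factor=0.5, patience=20)`. To make the behavior concrete without depending on PyTorch, here is a simplified plain-Python sketch of the same idea. The class `PlateauHalver` is hypothetical, and it omits the threshold, cooldown and min_lr options that the question asks about.

```python
class PlateauHalver:
    """Halve the learning rate when the monitored loss stops improving."""

    def __init__(self, lr=1e-4, factor=0.5, patience=20):
        self.lr = lr
        self.factor = factor
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, loss):
        if loss < self.best:
            self.best = loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        # Reduce only once the number of bad epochs exceeds the patience.
        if self.bad_epochs > self.patience:
            self.lr *= self.factor
            self.bad_epochs = 0
        return self.lr


sched = PlateauHalver()
sched.step(1.0)          # first epoch sets the best loss
for _ in range(21):      # 21 epochs without improvement
    lr = sched.step(1.0)
print(lr)  # 5e-05
```

In this sketch the learning rate drops from 1e-4 to 5e-5 on the 21st consecutive epoch without improvement, which matches "a patience of 20 epochs" as quoted from the documentation.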

Differences of reconstruction results between forward() and compress()

I found differences in the reconstructed images between the forward pass and actual compression when using the pretrained mbt2018 model provided by the library.
For example, when I passed Kodak image 19 through forward() of the pretrained mbt2018 model at quality 1, I got the following results (rounded):

Bit-rate: 0.0905 bpp
PSNR: 28.0561dB
MS-SSIM: 0.9084

However, when I actually compressed it, the reconstructed image was different from the one above.
Please note the PSNR and MS-SSIM values.

$ python3 -m compressai.utils.eval_model pretrained -a mbt2018 -q 1 -m mse ./kodak_19/

{
  "name": "mbt2018",
  "description": "Inference (ans)",
  "results": {
    "psnr": [
      28.058247955168817
    ],
    "ms-ssim": [
      0.9069046378135681
    ],
    "bpp": [
      0.091796875
    ],
    "encoding_time": [
      5.185348987579346
    ],
    "decoding_time": [
      10.088664054870605
    ]
  }
}

Other models, such as mbt2018_mean, were fine.
Is it a bug?
Thanks.
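Part of the bit-rate gap may simply come from how the two numbers are obtained: forward() reports an entropy estimate from the likelihoods, whereas the evaluation tool measures the length of the byte streams actually written. The sketch below shows the second computation. The byte count is an assumed value, chosen because it reproduces the reported bpp for a 768x512 Kodak image; it is not taken from the thread.

```python
def bpp_from_bytes(num_bytes, height, width):
    """Actual rate of a compressed file: total bits divided by pixel count."""
    return num_bytes * 8 / (height * width)


# Kodak images are 768x512. The byte count is an assumed illustration.
print(bpp_from_bytes(4512, 512, 768))  # 0.091796875
```

This does not explain the PSNR and MS-SSIM differences, which concern the reconstructed pixels rather than the stream length.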

About Pretrained models of Cheng2020Anchor at high bit rates

The pretrained Cheng2020Anchor models do not cover bit-rates higher than 1 bpp (only quality levels 1-6 are provided). I tried to train Cheng2020Anchor at high bit-rates by increasing lambda or the number of channels, but found that the performance degrades severely compared with other methods. Have the developers seen the same problem? Would it be possible to include Cheng2020Anchor models at higher bit-rates (>1 bpp), as for the other methods in the pretrained model zoo?

I would appreciate it if the developers could give me some advice.

object has no attribute 'aux_parameters'

Hello,

It seems that after the recent update, I get the following error in my code:

torch.nn.modules.module.ModuleAttributeError: 'FactorizedPrior' object has no attribute 'aux_parameters'

To double-check, I also ran your examples/CompressAI Inference Demo.ipynb example and got the same error. Has anything changed in the structure of the code?

Thank you again for this nice library.

Support DistributedDataParallel and DataParallel, and publish Python package

First of all, thank you for the great package!

1. Support DistributedDataParallel and DataParallel

I'm working on large-scale experiments that take a long time to train, and I'm wondering whether this framework can support DataParallel and DistributedDataParallel.

The current examples/train.py appears to support DataParallel through CustomDataParallel, but it raised the following error:

Traceback (most recent call last):
  File "examples/train.py", line 369, in <module>
    main(sys.argv[1:])
  File "examples/train.py", line 348, in main
    args.clip_max_norm,
  File "examples/train.py", line 159, in train_one_epoch
    out_net = model(d)
  File "/home/yoshitom/.local/share/virtualenvs/yoshitom-lJAkl1qx/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/yoshitom/.local/share/virtualenvs/yoshitom-lJAkl1qx/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 160, in forward
    replicas = self.replicate(self.module, self.device_ids[:len(inputs)])
  File "/home/yoshitom/.local/share/virtualenvs/yoshitom-lJAkl1qx/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 165, in replicate
    return replicate(module, device_ids, not torch.is_grad_enabled())
  File "/home/yoshitom/.local/share/virtualenvs/yoshitom-lJAkl1qx/lib/python3.6/site-packages/torch/nn/parallel/replicate.py", line 140, in replicate
    param_idx = param_indices[param]
KeyError: Parameter containing:
tensor([[[-10.,   0.,  10.]],
        [[-10.,   0.,  10.]],
        ... (the same row repeated, 128 rows in total)
        [[-10.,   0.,  10.]]], device='cuda:0', requires_grad=True)

(Run with pipenv run python examples/train.py --data ./dataset/ --batch-size 4 --cuda on a machine with 3 GPUs.)

After commenting out these two lines, https://github.com/InterDigitalInc/CompressAI/blob/master/examples/train.py#L333-L334 , training seems to work:

/home/yoshitom/.local/share/virtualenvs/yoshitom-lJAkl1qx/lib/python3.6/site-packages/torch/nn/modules/container.py:435: UserWarning: Setting attributes on ParameterList is not supported.
  warnings.warn("Setting attributes on ParameterList is not supported.")
Train epoch 0: [0/5000 (0%)]	Loss: 183.278 |	MSE loss: 0.278 |	Bpp loss: 2.70 |	Aux loss: 5276.71
Train epoch 0: [40/5000 (1%)]	Loss: 65.175 |	MSE loss: 0.096 |	Bpp loss: 2.70 |	Aux loss: 5273.95
Train epoch 0: [80/5000 (2%)]	Loss: 35.178 |	MSE loss: 0.050 |	Bpp loss: 2.69 |	Aux loss: 5271.21
Train epoch 0: [120/5000 (2%)]	Loss: 36.634 |	MSE loss: 0.052 |	Bpp loss: 2.68 |	Aux loss: 5268.45
Train epoch 0: [160/5000 (3%)]	Loss: 26.010 |	MSE loss: 0.036 |	Bpp loss: 2.68 |	Aux loss: 5265.67
...

Could you please fix the issue and also support DistributedDataParallel?
If you need more examples to identify the components causing this issue, let me know. I have a few more examples (error messages) for both DataParallel and DistributedDataParallel with different network architectures (containing CompressionModel).

2. Publish Python package

It would be much more useful if you could publish this framework as a Python package, so that we can install it with pip install compressai.

Thank you!

A question about using jpeg2000

Hello, sorry to bother you. I have a question about using jpeg2000 with this project: do I need to download ffmpeg and openjpeg myself? Thanks.

Weird behavior of pretrained models with CUDA

Python: 3.6.9
CUDA: 11.2
compressai: 1.1.3
torch: 1.8.1

When replicating the RD curves in the README for the Kodak dataset (wget http://r0k.us/graphics/kodak/kodak/kodim{0,1,2}{0,1,2,3,4,5,6,7,8,9}.png in ./kodak/), I observed weird behavior of the pretrained models with CUDA.

Everything appears to work well without CUDA

python -m compressai.utils.eval_model pretrained ./kodak/ -a bmshj2018-hyperprior --metric mse --quality 5

metric = mse

Downloading: "https://compressai.s3.amazonaws.com/models/v1/bmshj2018-hyperprior-5-f8b614e1.pth.tar" to /home/yoshitom/.cache/torch/hub/checkpoints/bmshj2018-hyperprior-5-f8b614e1.pth.tar
100.0%
{
  "name": "bmshj2018-hyperprior",
  "description": "Inference (ans)",
  "results": {
    "psnr": [
      34.52624269077672
    ],
    "ms-ssim": [
      0.9835608204205831
    ],
    "bpp": [
      0.6686842176649305
    ],
    "encoding_time": [
      0.2404747505982717
    ],
    "decoding_time": [
      0.5095066924889883
    ]
  }
}

metric = ms-ssim

python -m compressai.utils.eval_model pretrained ./kodak/ -a bmshj2018-hyperprior --metric ms-ssim --quality 5

Downloading: "https://compressai.s3.amazonaws.com/models/v1/bmshj2018-hyperprior-ms-ssim-5-c34afc8d.pth.tar" to /home/yoshitom/.cache/torch/hub/checkpoints/bmshj2018-hyperprior-ms-ssim-5-c34afc8d.pth.tar
100.0%
{
  "name": "bmshj2018-hyperprior",
  "description": "Inference (ans)",
  "results": {
    "psnr": [
      28.992422918554542
    ],
    "ms-ssim": [
      0.9866020356615385
    ],
    "bpp": [
      0.47353786892361116
    ],
    "encoding_time": [
      0.24171670277913412
    ],
    "decoding_time": [
      0.5283569494883219
    ]
  }
}

PSNR and MS-SSIM are both NaN when using CUDA

python -m compressai.utils.eval_model pretrained ./kodak/ -a bmshj2018-hyperprior --metric mse --quality 5 --cuda

metric = mse

Downloading: "https://compressai.s3.amazonaws.com/models/v1/bmshj2018-hyperprior-5-f8b614e1.pth.tar" to /home/yoshitom/.cache/torch/hub/checkpoints/bmshj2018-hyperprior-5-f8b614e1.pth.tar
100.0%
{
  "name": "bmshj2018-hyperprior",
  "description": "Inference (ans)",
  "results": {
    "psnr": [
      NaN
    ],
    "ms-ssim": [
      NaN
    ],
    "bpp": [
      0.6686876085069443
    ],
    "encoding_time": [
      0.034142365058263145
    ],
    "decoding_time": [
      0.025616129239400227
    ]
  }
}

python -m compressai.utils.eval_model pretrained ./kodak/ -a bmshj2018-hyperprior --metric ms-ssim --quality 5 --cuda

metric = ms-ssim

Downloading: "https://compressai.s3.amazonaws.com/models/v1/bmshj2018-hyperprior-ms-ssim-5-c34afc8d.pth.tar" to /home/yoshitom/.cache/torch/hub/checkpoints/bmshj2018-hyperprior-ms-ssim-5-c34afc8d.pth.tar
100.0%
{
  "name": "bmshj2018-hyperprior",
  "description": "Inference (ans)",
  "results": {
    "psnr": [
      NaN
    ],
    "ms-ssim": [
      NaN
    ],
    "bpp": [
      0.47353786892361116
    ],
    "encoding_time": [
      0.03800355394681295
    ],
    "decoding_time": [
      0.029240707556406658
    ]
  }
}

I didn't check all the combinations (model, quality, metrics, with/without CUDA), but at least bmshj2018-hyperprior with quality=8 (besides one with quality=5) also returned NaN when using CUDA (for both mse and ms-ssim checkpoints). There may be more checkpoints that face the same issue.

When I checked the output of a model (i.e., out_dec["x_hat"]), some values in the tensor were NaN when using CUDA, which must have caused this issue.

Error:: Model Update and Model Evaluation

First of all, thank you for the awesome work :D

After training, I'm trying to run compressai.utils.update_model on my training checkpoint.

So, I used:

python -m compressai.utils.update_model -h

And:

python -m compressai.utils.update_model /content/checkpoint_best_loss.pth.tar

(Should I use checkpoint_best_loss.pth.tar or checkpoint.pth.tar?)

It returned the following error:

Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.7/dist-packages/compressai/utils/update_model/__main__.py", line 139, in <module>
    main(sys.argv[1:])
  File "/usr/local/lib/python3.7/dist-packages/compressai/utils/update_model/__main__.py", line 110, in main
    net = model_cls.from_state_dict(state_dict)
  File "/usr/local/lib/python3.7/dist-packages/compressai/models/priors.py", line 277, in from_state_dict
    net.load_state_dict(state_dict)
  File "/usr/local/lib/python3.7/dist-packages/compressai/models/priors.py", line 267, in load_state_dict
    state_dict,
  File "/usr/local/lib/python3.7/dist-packages/compressai/models/utils.py", line 108, in update_registered_buffers
    dtype,
  File "/usr/local/lib/python3.7/dist-packages/compressai/models/utils.py", line 54, in _update_registered_buffer
    new_size = state_dict[state_dict_key].size()
KeyError: 'gaussian_conditional._quantized_cdf'

What's wrong?

After this I can run the evaluation, right?
But which positional argument should I use: pretrained or checkpoint?

Running this:

usage: __main__.py [-h] {pretrained,checkpoint} ...

Evaluate a model on an image dataset.

positional arguments:
  {pretrained,checkpoint}
                        model source

optional arguments:
  -h, --help            show this help message and exit

Could you help me with these two topics, please?
Thank you.

Doubts about command lines

Hello, first, thank you for sharing the good work.

To get to know the repository, I'm trying to use the plot command described at: https://interdigitalinc.github.io/CompressAI/cli_usage.html

What should I pass to -f? I didn't understand this part of the usage:
-f [ [ ...]], --results-file [ [ ...]]

Another thing: after updating the model with my own trained model, for example, will I be able to run the command
python3 examples/codec.py with my own image?

Thanks.

Difference between forward vs compress/decompress reconstruction

Hello,

I have a question for a better understanding of your very useful and nice library, and it would be great if you could add a similar example to your example folder for others.

I did a simple test and noticed a difference between the actual reconstruction results (obtained with the compress/decompress functions) and the ones obtained with the forward function. The difference is in both the reconstructed results and the estimated bits. However, if I clamp the output of the forward function, there is no difference in the reconstruction results, but there is still a difference between the theoretical and the actual bitrates. So I have two questions in that regard:
1- Does it mean that the compress and decompress functions somehow clamp the results, i.e., there is no need to clamp the output ourselves?
2- Does the difference between theoretical and actual bitrates come from the practical implementation of the encoder, which imposes some extra bits (for tasks such as the "end of file" symbol, discretization of everything into bits, etc.)?

Here is a simple code to test:

import math
import torch
from torchvision import transforms
from PIL import Image

def compute_theoretical_bits(out_net):
    list_latent_bits = [torch.ceil((torch.log(likelihoods).sum(dim=(1, 2, 3)) / (-math.log(2)))) for likelihoods in out_net['likelihoods'].values()]
    total_bits_per_image = torch.sum(torch.stack(list_latent_bits, dim=0), dim=0).long()
    return total_bits_per_image

def compute_actual_bits(compressed_stream):
    list_latent_bits = [torch.tensor([len(s) * 8 for s in list_s]) for list_s in compressed_stream["strings"]]
    total_bits_per_image = torch.sum(torch.stack(list_latent_bits, dim=0), dim=0)
    return total_bits_per_image

from compressai.zoo import bmshj2018_hyperprior

device = 'cuda' if torch.cuda.is_available() else 'cpu'
net = bmshj2018_hyperprior(quality=2, pretrained=True).eval().to(device)
net.update(force=True)  # update the model CDFs parameters.

print(f'Parameters: {sum(p.numel() for p in net.parameters())}')
print(f'Entropy bottleneck(s) parameters: {sum(p.numel() for p in net.aux_parameters())}')

img = Image.open('../data/stmalo_fracape.png').convert('RGB')
x = transforms.ToTensor()(img).unsqueeze(0)
x = x.to(device)
with torch.no_grad():
    #output of training
    out_net = net(x)
    out_net['x_hat'].clamp_(0, 1)
    bits_per_image = compute_theoretical_bits(out_net)

    # output of real compression and decompression
    compressed = net.compress(x)
    compressed_bits_per_image = compute_actual_bits(compressed)
    decompressed = net.decompress(compressed["strings"], compressed["shape"])
    # decompressed['x_hat'].clamp_(0, 1) # no need to clamp decompressed results?

    diff = (out_net["x_hat"] - decompressed["x_hat"]).abs()
    diff_in_bits = (bits_per_image - compressed_bits_per_image).abs()
    print("max difference={}, min difference={}".format(diff.max(), diff.min()))
    print("diff in bits={}, ratio (compressed/training)={}%".format(diff_in_bits, torch.div(compressed_bits_per_image, bits_per_image)))

    isCloseReconstruction = torch.allclose(out_net["x_hat"], decompressed["x_hat"], atol=1e-06, rtol=0)
    isCloseBits = torch.allclose(bits_per_image, compressed_bits_per_image, atol=0, rtol=1e-2)
    assert isCloseReconstruction, "The output of decompressed image is not equal to image"
    assert isCloseBits, "The number of compressed bits is not equal to the number of bits computed in training phase"
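
On the second question, it may help to see the two quantities side by side. The sketch below uses only the standard library and hypothetical values (the likelihoods and the byte string are made up for illustration, not taken from a real model): the theoretical figure is the sum of -log2 of the likelihoods and can take any real value, whereas the byte strings returned by a coder are always a whole number of bytes.

```python
import math


def theoretical_bits(likelihoods):
    """Ideal code length in bits: -sum(log2 p) over all coded symbols."""
    return -sum(math.log2(p) for p in likelihoods)


def actual_bits(strings):
    """Bits actually written: every byte string is a whole number of bytes."""
    return sum(len(s) * 8 for s in strings)


# Hypothetical values, for illustration only.
likelihoods = [0.5, 0.25, 0.25, 0.125]  # ideal cost: 1 + 2 + 2 + 3 = 8 bits
strings = [b"\x9c\x40"]                 # a made-up 2-byte stream, i.e. 16 bits

ideal = theoretical_bits(likelihoods)
written = actual_bits(strings)
overhead = written - ideal
```

How large the gap is in practice depends on the entropy coder and on how the stream is terminated, which this sketch does not model.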

[BUG ?] Theoretical BPP > Empirical BPP for mbt2018-mean

Hi!

When running the comparison notebook, I also tried computing the real BPP (instead of only the theoretical one). For every model it was quite close to the theoretical one (slightly larger, which makes sense), but for mbt2018-mean the theoretical BPP is larger than the empirical one, which I don't think should be possible.

Here's the few lines of code I added, as well as the surprising result (in red)

I looked at the code and it seems that MeanScaleHyperprior does not take the absolute value of y (i.e. both in compression and in forward you have z = self.h_a(y)), but ScaleHyperprior does (i.e. z = self.h_a(torch.abs(y))). Could that be the reason? I thought the only difference between MeanScaleHyperprior and ScaleHyperprior was that one predicts the mean of the Gaussian and the other doesn't.

Thanks for the library 💯

Floating point exception during model.update()

Bug

Hello,
I get a floating point exception when trying to update a CompressionModel. There is no stack for the error message, so I guess it comes from an internal C module.

Using prints, I could trace that the problem comes from:

model.update() -> self._pmf_to_cdf(pmf, tail_mass, pmf_length, max_length) -> _cdf = pmf_to_quantized_cdf(prob, self.entropy_coder_precision)

and that it was raised when calling the function on a Tensor p where all entries but one are zero

To Reproduce

Steps to reproduce the behavior:

Call

prob = torch.cat((p[: pmf_length[i]], tail_mass[i]), dim=0)
_cdf = pmf_to_quantized_cdf(prob, self.entropy_coder_precision)

on the following tensor:

tensor([0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 1.6690e-33,
6.5444e-11, 1.0000e+00, 1.2011e-13, 1.0696e-36, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00,
0.0000e+00, 0.0000e+00], device='cuda:0', grad_fn=)

Expected behavior

I don't know what the returned value should be, but it seems that my problem is a corner case incorrectly handled
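
To make the corner case concrete, here is a pure-Python sketch of one way to turn a PMF into an integer CDF while keeping every symbol decodable. It is not the library's pmf_to_quantized_cdf (the function name, the reserve-one-count strategy and the error handling are assumptions made for illustration); it only shows that a PMF with almost all of its mass on a single entry can still be mapped to a strictly increasing table.

```python
def pmf_to_quantized_cdf_sketch(pmf, precision=16):
    """Map a PMF to an integer CDF whose last entry is 2**precision.

    Every symbol gets at least one count, so zero-probability entries
    do not produce empty intervals.
    """
    total = 1 << precision
    n = len(pmf)
    if n == 0 or n > total:
        raise ValueError("PMF length must be between 1 and 2**precision")
    mass = sum(pmf)
    if mass <= 0:
        raise ValueError("PMF must have positive total mass")

    # Reserve one count per symbol, then share the rest proportionally.
    budget = total - n
    freqs = [1 + int(p / mass * budget) for p in pmf]

    # Flooring leaves a small remainder; give it to the most probable symbol.
    remainder = total - sum(freqs)
    most_probable = max(range(n), key=lambda i: pmf[i])
    freqs[most_probable] += remainder

    cdf = [0]
    for f in freqs:
        cdf.append(cdf[-1] + f)
    return cdf


# A PMF shaped like the one reported above: one dominant entry, the rest ~0.
pmf = [0.0] * 100 + [1.6690e-33, 6.5444e-11, 1.0, 1.2011e-13, 1.0696e-36] + [0.0] * 100
cdf = pmf_to_quantized_cdf_sketch(pmf, precision=16)
```

Reserving one count per symbol costs a little rate on very peaked distributions, which is the usual trade-off for guaranteeing that every symbol stays representable.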

Environment

PyTorch version: 1.7.0
Is debug build: True
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.4 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: version 3.10.2

Python version: 3.7 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: 9.1.85
GPU models and configuration:
GPU 0: Tesla V100-SXM2-32GB
GPU 1: Tesla V100-SXM2-32GB
GPU 2: Tesla V100-SXM2-32GB
GPU 3: Tesla V100-SXM2-32GB

Nvidia driver version: 440.33.01
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.4
[pip3] pytorch-msssim==0.2.0
[pip3] torch==1.7.0
[pip3] torch-cluster==1.5.8
[pip3] torch-geometric==1.6.3
[pip3] torch-scatter==2.0.5
[pip3] torch-sparse==0.6.8
[pip3] torch-spline-conv==1.2.0
[pip3] torchvision==0.8.1
[conda] numpy 1.19.4 pypi_0 pypi
[conda] pytorch-msssim 0.2.0 pypi_0 pypi
[conda] torch 1.7.0 pypi_0 pypi
[conda] torch-cluster 1.5.8 pypi_0 pypi
[conda] torch-geometric 1.6.3 pypi_0 pypi
[conda] torch-scatter 2.0.5 pypi_0 pypi
[conda] torch-sparse 0.6.8 pypi_0 pypi
[conda] torch-spline-conv 1.2.0 pypi_0 pypi

- PyTorch / CompressAI Version (e.g., 1.0 / 0.4.0): torch   1.7.0, compressai  1.1.5
- OS (e.g., Linux): Ubuntu 18.04.4 LTS (Bionic Beaver)
- How you installed PyTorch / CompressAI (`pip`, source): pip 
- Build command you used (if compiling from source):
- Python version: 1.7.0
- CUDA/cuDNN version: 10.2
- GPU models and configuration:
- Any other relevant information: Problem appears both on cpu and gpu

Quantization in `JointAutoregressiveHierarchicalPriors`

Thank you for your great work.

I noticed that torch.round is directly used in the implementation of JointAutoregressiveHierarchicalPriors.
https://github.com/InterDigitalInc/CompressAI/blob/master/compressai/models/priors.py#L549
I think the quantization should be performed using self.gaussian_conditional like other classes extended from CompressionModel. More specifically, self.gaussian_conditional._quantize should be used.

That would make the implementation consistent with the one used for dequantization.
https://github.com/InterDigitalInc/CompressAI/blob/master/compressai/models/priors.py#L637

Small error in printing status

Hello,

Thank you for this nice library. I noticed that in your CompressAI/examples/train.py example, inside the test_epoch function when you are printing 'Average losses:' you are actually printing the last values rather than average values. I think

    print(f'Test epoch {epoch}: Average losses:'
          f'\tLoss: {loss.val:.3f} |'
          f'\tMSE loss: {mse_loss.val:.3f} |'
          f'\tBpp loss: {bpp_loss.val:.2f} |'
          f'\tAux loss: {aux_loss.val:.2f}\n')

should be changed to

    print(f'Test epoch {epoch}: Average losses:'
          f'\tLoss: {loss.avg:.3f} |'
          f'\tMSE loss: {mse_loss.avg:.3f} |'
          f'\tBpp loss: {bpp_loss.avg:.2f} |'
          f'\tAux loss: {aux_loss.avg:.2f}\n')

I know that it is very subtle, but just for the sake of completeness I am reporting this error.
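
The difference between the two attributes can be seen with a small running-average helper. This is a minimal sketch assuming a meter with val, sum, count and avg fields, as the attribute names above suggest; the loss values are hypothetical.

```python
class AverageMeter:
    """Tracks the most recent value and the running average."""

    def __init__(self):
        self.val = 0.0
        self.sum = 0.0
        self.count = 0
        self.avg = 0.0

    def update(self, val, n=1):
        self.val = val            # last value seen
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count


loss = AverageMeter()
for batch_loss in (6.0, 2.0, 1.0):  # hypothetical per-batch losses
    loss.update(batch_loss)

last_value = loss.val  # value of the final batch only
average = loss.avg     # mean over all batches
```

Printing val under an "Average losses" label would therefore report only the final batch.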

Inconsistency between training and testing procedures

Thanks for your work! I noticed what seems to be an inconsistency between the training and testing procedures, regarding the entropy coding of the latent representation y and the hyperprior z.

During training, the quantized [y] (simulated by adding uniform noise) is coded. During testing, however, the quantized [y - means_hat] is coded. The same applies to the hyperprior z.

My question is: why are the training and testing procedures different? Is there a special reason for this? And does it influence coding performance?
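
The two procedures being compared can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the library's code: the training-time proxy is taken to be additive uniform noise in [-0.5, 0.5], the test-time path rounds the residual y - mean and adds the mean back, and the function names and sample values are hypothetical.

```python
import random


def quantize_noise(y, rng):
    """Training-time proxy: add uniform noise in [-0.5, 0.5] to each value."""
    return [v + rng.uniform(-0.5, 0.5) for v in y]


def quantize_symbols(y, means):
    """Test-time symbols: round the residual around the predicted mean."""
    return [round(v - m) for v, m in zip(y, means)]


def dequantize(symbols, means):
    """Reconstruct the latent by adding the mean back to the integer symbols."""
    return [s + m for s, m in zip(symbols, means)]


rng = random.Random(0)
y = [0.2, 2.9, -4.8]       # hypothetical latent values
means = [0.4, 1.5, -3.0]   # hypothetical predicted means

y_noisy = quantize_noise(y, rng)
symbols = quantize_symbols(y, means)
y_hat = dequantize(symbols, means)
```

In both paths the result stays within 0.5 of the input, which is why the noisy version is commonly used as a differentiable stand-in for rounding.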

Encoding/Decoding time

Hello Jean,

When evaluating the pretrained models on several datasets, I found that the encoding/decoding time of the same model differs depending on the reconstruction quality. Is this to be expected? I thought the time would be the same if the network does not change.
Maybe background jobs on my machine slow things down at runtime, but the behavior is consistent every time: higher quality always gives a slightly higher running time.

Thank you.

RuntimeError in update the bottleneck parameters

Hello,
First of all, thank you for providing this very nice library. I faced an error that I wanted to share with you to fix it. Let's say, I have a model which is trained for several epochs and now it is saved using save_checkpoint function as you are doing in example CompressAI/examples/train.py. Since you haven't mentioned the load part here, in order to load this checkpoint to continue training, I do as follows:

device = "cuda" if args.cuda and torch.cuda.is_available() else "cpu"
net = AutoEncoder()
net = net.to(device)
optimizer = optim.Adam(net.parameters(), lr=args.learning_rate)
aux_optimizer = optim.Adam(net.aux_parameters(), lr=args.aux_learning_rate)
criterion = RateDistortionLoss(lmbda=args.lmbda)
checkpoint = torch.load("checkpoint.pth.tar")
net.load_state_dict((checkpoint["net_state_dict"]))
net.update(force=True)  # update the model CDFs parameters.

First, I wanted to be sure if I am doing it in a correct way (for example, I use net.update with force=True to update the entropy model parameters, etc.)

Second, if I do so, I get an error in update function of class EntropyBottleneck(EntropyModel)

samples = samples[None, :] + pmf_start[:, None, None]
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

This is because

samples.device= cpu
pmf_start.device= cuda:0

To fix this, you should change your
samples = torch.arange(max_length)
to
samples = torch.arange(max_length, device=pmf_start.device)
in your def update(self, force=False).

Since I am not using exactly the code shown above (I changed your original code a little, so to simplify the explanation I used your simplified train.py code), could you first verify whether you have this problem on your side? If so, and once it is fixed, I think a similar problem may exist elsewhere in the code as well.

Again I wanted to say thank you for this wonderful library you have provided.
Best,
Navid

About evaluation

Hello,
At first, thanks for your work!

I have trained the ScaleHyperprior model on my own dataset, and I'm trying to use inference() to evaluate my checkpoint on it. However, the PSNR of several images is NaN (1 to 4 of the 24 test images), and this happens randomly: every time I run my evaluation code, different images get a NaN PSNR, while the PSNR of the other images is correct. I checked the decompressed images with NaN PSNR, and they contain many block mosaics and serious artifacts. However, the results of inference_entropy_estimation() are all correct. I'm not sure whether my evaluation code is correct; it is as follows (ScaleHyperprior(128, 192) is the same as in the source code):

net = ScaleHyperprior(128, 192)
net = net.from_state_dict(torch.load(checkpoint_path)["state_dict"])
net.update(force=True)
net = net.eval().cuda()

After these steps, I fed the test images and the net into inference() to get bpp and PSNR.

So what's wrong with this? Is this a bug, or is my code wrong?

I use compressai 1.1.4.
Thanks!

error in eval_model

Hello,
Could you please provide the exact commands to replicate the curves using the pretrained models in compressai?

Using the pretrained models in compressai, I wanted to replicate the RD curves shown in the README.md

I tried python -m compressai.utils.eval_model -h and python -m compressai.utils.eval_model --help to see how to execute, but it returned the following error.

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/yoshitom/.local/share/virtualenvs/compressai-doaxpoXd/lib/python3.6/site-packages/compressai/utils/eval_model/__main__.py", line 303, in <module>
    main(sys.argv[1:])
  File "/home/yoshitom/.local/share/virtualenvs/compressai-doaxpoXd/lib/python3.6/site-packages/compressai/utils/eval_model/__main__.py", line 254, in main
    args = setup_args().parse_args(argv)
  File "/home/yoshitom/.local/share/virtualenvs/compressai-doaxpoXd/lib/python3.6/site-packages/compressai/utils/eval_model/__main__.py", line 217, in setup_args
    help="model source", dest="source", required=True
  File "/usr/lib/python3.6/argparse.py", line 1716, in add_subparsers
    action = parsers_class(option_strings=[], **kwargs)
TypeError: __init__() got an unexpected keyword argument 'required'

OS: Ubuntu 20
Python: 3.6.9
compressai: 1.1.2

Thank you!
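
The required keyword of add_subparsers was added in Python 3.7, which would explain the TypeError on Python 3.6.9. A minimal sketch of a workaround that behaves the same on older and newer interpreters is to set the attribute after creating the subparsers. The parser below is a stand-in written for illustration, not the actual eval_model code.

```python
import argparse


def setup_args():
    parser = argparse.ArgumentParser(
        description="Evaluate a model on an image dataset."
    )
    # Do not pass required=True here: Python 3.6 rejects that keyword.
    subparsers = parser.add_subparsers(help="model source", dest="source")
    # Setting the attribute afterwards works on 3.6 as well as on 3.7+.
    subparsers.required = True
    subparsers.add_parser("pretrained")
    subparsers.add_parser("checkpoint")
    return parser


args = setup_args().parse_args(["pretrained"])
```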

Bug in JointAutoregressiveHierarchicalPriors._compress_ar

https://github.com/InterDigitalInc/CompressAI/blob/master/compressai/models/priors.py
In JointAutoregressiveHierarchicalPriors._compress_ar:
ctx_p = F.conv2d( y_crop, self.context_prediction.weight, bias=self.context_prediction.bias )
self.context_prediction.weight ignores the mask, leading to the context mismatch between _compress_ar and _decompress_ar.
y_hat in _compress_ar is the full feature and needs to be masked.

Solution:
masked_weight = self.context_prediction.weight * self.context_prediction.mask
ctx_p = F.conv2d( y_crop, masked_weight, bias=self.context_prediction.bias )
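
To see what the mask removes, the sketch below builds a raster-scan causal mask in plain Python. It assumes a type-A mask of the kind used in PixelCNN-style masked convolutions (the current position and everything after it in scan order are hidden) and a 5x5 kernel; the function name is hypothetical.

```python
def causal_mask(kernel_size=5, mask_type="A"):
    """Return a kernel_size x kernel_size list of 0/1 entries.

    Type "A" hides the center position and everything after it in
    raster-scan order; type "B" keeps the center visible.
    """
    k = kernel_size
    center = k // 2
    mask = [[1] * k for _ in range(k)]
    first_hidden = center + (1 if mask_type == "B" else 0)
    for col in range(first_hidden, k):
        mask[center][col] = 0      # rest of the center row
    for row in range(center + 1, k):
        mask[row] = [0] * k        # all rows below the center
    return mask


mask = causal_mask(5, "A")
visible = sum(sum(row) for row in mask)
```

Only 12 of the 25 taps are visible here. Convolving with the unmasked weight lets the remaining 13 taps read positions that the decoder has not reconstructed yet, which is the mismatch described above.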

Evaluation

I trained a model on my own dataset (with examples/train.py), but I don't know how to evaluate the toy model. Can you help me? If that is not possible, how can I do the same with my own dataset using the bmshj2018-factorized architecture? I have to use my own dataset. Thank you.

Decompress multiple images

Hi,

Right now you cannot decompress multiple images using EntropyBottleneck.decompress (but you can compress multiple images at once). As a result, bttleneck.decompress(bttleneck.compress(images), shape) would break. This is because on this line you use self._medians().detach().view(1, -1, 1, 1) (i.e. the batch size is one for medians even though it is batch_size for indices), but on this line you explicitly ensure that the batch size for indices and medians is the same.

During compression you don't have this problem, because you don't explicitly test for equal batch_size and broadcasting takes care of the rest. I would suggest one of the following simple solutions:

- either remove the explicit check for equal batch size (this might break things downstream, though), or
- simply use self._medians().detach().expand(len(strings), -1, 1, 1) instead of self._medians().detach().view(1, -1, 1, 1) here. This only uses an expanded view, so there is no increase in memory usage.

How to estimate bpp of a standalone tensor using EntropyBottleneck() layer?

I want to estimate the bpp of a tensor without compressing it, and tried to use the EntropyBottleneck layer of CompressAI for this purpose. However, when I used it on a 512 x 512 image tensor and on a random tensor, I got nearly the same bpp of around 20 with an untrained EntropyBottleneck layer in both cases. Theoretically, the random tensor should have a much higher bpp than the image. Also, since the pixel values of both tensors lie in 0-255, the bpp should be no more than 8 even for a fully random tensor (I am not sure whether the torch.float32 data type plays a role here). Since this was an untrained layer, I assumed the returned bpps were just arbitrary values and that training the layer would fix the issue and give me a close estimate of the true bpp. But even after thousands of iterations, the bpp values for both tensors did not change from their initial values (only the 3rd and 4th digits after the decimal point fluctuated, without a monotonic increase or decrease), even though the entropy bottleneck loss steadily decreased.

Here is a minimal working example of the issue. I tried to closely follow this tutorial for custom model, without the encoder and decoder layers (as I do not need to compress the tensor). I am just showing the example with the image tensor to keep it short. The random tensor has the same behavior.

from skimage import data
import math

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

from compressai.entropy_models import EntropyBottleneck
from compressai.models import CompressionModel

# Define a network having only the EntropyBottleneck layer
class Network(CompressionModel):
    def __init__(self, N=128):
        super().__init__(N) # using super().__init__() as given in the tutorial resulted in an error: __init__() missing 1 required positional argument: 'entropy_bottleneck_channels'
        self.entropy_bottleneck = EntropyBottleneck(N)
        

    def forward(self, x):
        
        x_hat, x_likelihoods = self.entropy_bottleneck(x)
        return x_hat, x_likelihoods

# initialize the network with entropy bottleneck layer with only 1 channel for a single grayscale image
net = Network(N=1) 

# read image data and convert to tensor
image = data.camera() # read cameraman image
H, W = image.shape
img_tensor = torch.tensor(image, dtype=torch.float32).view(-1, 1, H, W)
# rand_tensor = torch.randint(0,255,(1,1,H,W), dtype=torch.float32) 
N, _, H, W = img_tensor.size()
num_pixels = N * H * W

# compute bpp with untrained layer
img_approx, img_likelihood = net(img_tensor)
bpp_loss_img = torch.log(img_likelihood).sum() / (-math.log(2) * num_pixels)
print ("Image BPP", bpp_loss_img)

# Train EntropyBottleneck layer

aux_parameters = set(p for n, p in net.named_parameters() if n.endswith(".quantiles"))
aux_optimizer = optim.Adam(aux_parameters, lr=1e-3)
for i in range(1000):
    aux_optimizer.zero_grad()
    img_hat, img_likelihoods = net(img_tensor)
    aux_loss = net.aux_loss()
    aux_loss.backward()
    aux_optimizer.step()
    bpp_loss_img = torch.log(img_likelihoods).sum() / (-math.log(2) * num_pixels)
    print(bpp_loss_img, aux_loss)

I also tried using an EntropyBottleneck by itself, i.e. without the custom network inheriting CompressionModel, but the outcome was the same. It will be very helpful if you could help me debug this issue and/or let me know how can I estimate the bpp of a standalone tensor using this library. Thanks.
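
As a sanity check that does not depend on any learned model, the zeroth-order empirical entropy of the pixel values gives a reference figure for a memoryless code. The sketch below is plain Python with made-up inputs; it is a baseline to compare against, not a replacement for the EntropyBottleneck estimate.

```python
import math
from collections import Counter


def empirical_entropy_bits(values):
    """Zeroth-order entropy in bits per symbol, from the value histogram."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical inputs: every 8-bit value equally often, and a constant image.
uniform_pixels = list(range(256)) * 4
constant_pixels = [128] * 1024

uniform_bpp = empirical_entropy_bits(uniform_pixels)
constant_bpp = empirical_entropy_bits(constant_pixels)
```

For 8-bit data this figure cannot exceed 8 bits per pixel, so an estimate around 20 bpp suggests that the density model is not matched to the scale of the input.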

GMM module is not used in cheng2020-anchor?

Excuse me, I looked at the cheng2020-anchor code and found that the entropy model uses a single Gaussian model, whereas the authors' paper uses a Gaussian Mixture Model (GMM). Could you update this code? Thanks~

Evaluation Script

Hello,

First of all, thank you for your work.

I trained a model (bmshj2018-factorized) on my own dataset and then tried to do some tests with your evaluation script.

I used train.py, exactly as you recommend in Readme.md or in Issue 32 (now closed).
python3 train.py -d $DATASET --epochs 300 -lr 1e-4 --batch-size 16 --cuda --save

Then,
python3 -m compressai.utils.eval_model checkpoint /path/to/images/folder/ -a bmshj2018-factorized -p checkpoint.pth.tar

Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/udd/dbacchus/Codes/CompressAI/compressai/utils/eval_model/__main__.py", line 301, in <module>
    main(sys.argv[1:])
  File "/udd/dbacchus/Codes/CompressAI/compressai/utils/eval_model/__main__.py", line 277, in main
    model = load_func(*opts, run)
  File "/udd/dbacchus/Codes/CompressAI/compressai/utils/eval_model/__main__.py", line 145, in load_checkpoint
    return architectures[arch].from_state_dict(torch.load(checkpoint_path)).eval()
  File "/udd/dbacchus/Codes/CompressAI/compressai/models/priors.py", line 165, in from_state_dict
    N = state_dict["g_a.0.weight"].size(0)
KeyError: 'g_a.0.weight'

This is the same error message as in Issue 22 but I don't understand the cause here as I'm not training a toy example not defined in the reference CompressAI models.

When I run,
print(torch.load("checkpoint_best_loss.pth.tar").keys())

I get,
dict_keys(['epoch', 'state_dict', 'loss', 'optimizer', 'aux_optimizer', 'lr_scheduler'])

Thank you.
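
The key listing above suggests that the weights are nested under 'state_dict', so code that looks up 'g_a.0.weight' in the top-level dictionary would not find it. A minimal sketch of unwrapping such a checkpoint first; the helper name is hypothetical and the dictionary stands in for the result of torch.load, with a placeholder instead of a real tensor.

```python
def extract_state_dict(checkpoint):
    """Return the model weights from a training checkpoint.

    Training checkpoints often wrap the weights together with optimizer
    state; a bare state dict is returned unchanged.
    """
    if isinstance(checkpoint, dict) and "state_dict" in checkpoint:
        return checkpoint["state_dict"]
    return checkpoint


# Stand-in for torch.load("checkpoint_best_loss.pth.tar"); values are placeholders.
checkpoint = {
    "epoch": 10,
    "state_dict": {"g_a.0.weight": "tensor placeholder"},
    "loss": 0.5,
    "optimizer": {},
    "aux_optimizer": {},
    "lr_scheduler": {},
}

state_dict = extract_state_dict(checkpoint)
```

Whether the evaluation script expects the unwrapped weights or a checkpoint converted by another utility is not stated here, so treat this as a diagnostic step rather than a fix.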

Entropy model in Cheng2020Anchor

Firstly, thanks to the authors for sharing this amazing work. I have a question about the Cheng2020Anchor model. In Cheng's original paper, a discretized Gaussian mixture likelihood is used, in which three Gaussian distributions are weighted. However, I find that the Cheng2020Anchor model still uses the single Gaussian distribution from JointAutoregressiveHierarchicalPriors. Is my understanding right? Thanks

argparse does not like metavar as empty strings

Hello,

When I use python3 -m compressai.utils.bench vtm --help I get the following error:

  File "/usr/lib/python3.8/argparse.py", line 349, in _format_usage
    assert ' '.join(opt_parts) == opt_usage
AssertionError

When I searched the net, I found this link that explains

argparse does not like metavar=''.

Either give them a name or remove the options - but don't leave them as empty strings.

When I comment out the metavar='' in compressai/utils/bench/codecs.py, the program works properly. The same problem exists for hm as well.

Thank you very much again for this nice package.
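
A sketch of the "give them a name" option from the quoted advice, using only argparse. The program name, option names and help strings below are hypothetical and are not taken from compressai/utils/bench/codecs.py.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="bench-sketch")
    # A non-empty metavar gives the usage formatter a token to print.
    parser.add_argument(
        "-b", "--build-dir", metavar="BUILD_DIR", type=str,
        help="codec build directory (hypothetical option)",
    )
    parser.add_argument(
        "-c", "--config", metavar="CONFIG", type=str,
        help="codec configuration file (hypothetical option)",
    )
    return parser


help_text = build_parser().format_help()
```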

About get_scale_table(min=SCALES_MIN, max=SCALES_MAX, levels=SCALES_LEVELS)

Documentation

The original default values are:
#https://github.com/InterDigitalInc/CompressAI/blob/release/1.1.1/compressai/models/priors.py

SCALES_MIN = 0.11
SCALES_MAX = 256
SCALES_LEVELS = 64

When the input image data is 16 bit instead of 8 bit, should these values be changed according to the data range? and how to decide the SCALES_LEVELS with new data range?
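
To make the role of the three constants concrete, here is a sketch that assumes the table is spaced evenly in log space between the minimum and the maximum. That spacing is an assumption on my part, and log_spaced_table is a hypothetical stand-in written in plain Python, not the library function:

```python
import math
from typing import List


def log_spaced_table(low: float, high: float, levels: int) -> List[float]:
    """Return `levels` values spaced evenly in log space from low to high."""
    # Assumption: the scale table is log-spaced; this is a sketch only.
    step = (math.log(high) - math.log(low)) / (levels - 1)
    return [math.exp(math.log(low) + i * step) for i in range(levels)]


table = log_spaced_table(0.11, 256, 64)
print(len(table), table[0], table[-1])
```

Under this assumption, changing the maximum stretches the same number of levels over a wider range, so each step between neighbouring scales becomes coarser unless the level count is raised as well. The sketch does not settle whether 16-bit input actually requires different values.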

Unable to train

First of all, thank you for your amazing work. I have a question about training. I am trying to train the MeanScaleHyperprior network with (192, 192) channels using the Vimeo90k dataset. I randomly crop 256*256 patches, use a batch size of 16, and set lambda=0.0018 (quality 1). However, I am not able to train the network well enough for the results to match the plots at https://interdigitalinc.github.io/CompressAI/zoo.html. There is one small difference in how I read images: I use imageio.imread, which returns values in the 0-255 range, and then normalize by dividing by 255. I observe no significant change after 500k iterations; training is stuck at 26.86 dB, 0.30 bpp on the Kodak dataset. Do you have any guess as to what the reason may be? Thanks

Is there any data augmentation strategy used during training?

I am trying to train the ScaleHyperprior model using the code provided in examples/train.py with the ImageNet2012/DIV2K datasets. The learning rates for the main optimizer and aux_optimizer are both set to 1e-4. The learning rate of the main optimizer is then divided by 2 when the evaluation loss reaches a plateau, as described in https://interdigitalinc.github.io/CompressAI/zoo.html. However, after nearly one week of training, the rate-distortion performance on Kodak-24 still has a gap compared to the results provided.

I am also training a model on vimeo_test_clean from Vimeo90K; after 2 days, it does not seem to be converging to the results provided.
Have I missed something? Is there any data augmentation strategy used during training?

ModuleNotFoundError: No module named 'compressai._CXX'

I am getting this error: ModuleNotFoundError: No module named 'compressai._CXX'

Thanks in advance.

Zoo models crash at large image size.

I have a dataset of 1280x720 images, and it seems the zoo models cannot compress them: I get all kinds of errors related to mismatched tensor sizes. Is this expected due to the image size, or is it a bug?
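
One workaround often used for this kind of size mismatch is to pad the input so each side is a multiple of the network's total downsampling factor, then crop the reconstruction back. The sketch below only computes the padded size. The factor of 64 is an assumption (it is not stated in this thread), and padded_size is a hypothetical helper:

```python
from typing import Tuple


def padded_size(height: int, width: int, multiple: int = 64) -> Tuple[int, int]:
    """Round each dimension up to the nearest multiple (hypothetical helper)."""
    # Assumption: the model needs sides divisible by `multiple`; 64 is a guess
    # at the total stride, not a value confirmed in this thread.
    new_h = -(-height // multiple) * multiple  # ceiling division
    new_w = -(-width // multiple) * multiple
    return new_h, new_w


h, w = padded_size(720, 1280)
print(h, w)
```

Under that assumption 1280 already fits but 720 does not, which would be consistent with the mismatched-size errors described above.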

`mean_hat` is not used for `y_hat` in `JointAutoregressiveHierarchicalPriors` in the evaluation mode

I find that mean_hat is not taken into account when calculating y_hat in the forward function of JointAutoregressiveHierarchicalPriors in evaluation mode.
https://github.com/InterDigitalInc/CompressAI/blob/master/compressai/models/priors.py#L476

In evaluation mode, y_hat should be calculated as dequantize(quantize(y - mean_hat)) + mean_hat.
It is already implemented that way in the compress function and in forward of MeanScaleHyperprior.
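
A scalar sketch of the difference being described, using plain rounding as the quantizer. The function names are hypothetical and the numbers are made up for illustration:

```python
def quantize_plain(y: float) -> float:
    """Round y to the nearest integer, ignoring the predicted mean."""
    return float(round(y))


def quantize_around_mean(y: float, mean_hat: float) -> float:
    """Round the residual y - mean_hat, then add the mean back."""
    return float(round(y - mean_hat)) + mean_hat


# Made-up values: the two schemes give different reconstructions.
y, mean_hat = 2.7, 0.4
print(quantize_plain(y), quantize_around_mean(y, mean_hat))
```

The two results differ whenever mean_hat is not an integer, so a forward pass that rounds y directly would not match what a decoder using the mean-centered form reconstructs.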

Dictionary key error

Bug

To Reproduce

Steps to reproduce the behavior:

1. Train your own model
2. Update your own model
3. Evaluate your own model
All steps use -a bmshj2018-factorized -p checkpoint_best_loss.pth.tar

Expected behavior

CompressAI/compressai/models/priors.py, line 161, in from_state_dict:
N = state_dict["g_a.0.weight"].size(0) requires a dictionary containing the keys "g_a.xxxxxx", but this item is actually at state_dict["state_dict"]["g_a.0.weight"].

Other lines that use state_dict hit the same error.

Additional context

A question about bpp calculation

I am sorry to bother you again, but I have a question about bpp calculation. I constructed a toy network that tries to compress randomly generated float data. After training, I run inference on the training data as well. What confuses me is that the actual bpp (calculated from the length of the compressed string) is much larger than the theoretical bpp (calculated from the likelihoods). Shouldn't the actual bpp be almost the same as the theoretical bpp? Am I leaving something out? I hope for your reply, thanks a lot.
The following is the code. I first randomly generate data of size batch*channel (10*64); the input size of the entropy bottleneck is 10*16:

from compressai.models import CompressionModel
import torch
from torch.nn import Linear
import torch.nn as nn
import torch.optim as optim
from torch.nn import MSELoss
import math

class Network(CompressionModel):
    def __init__(self):
        super().__init__(entropy_bottleneck_channels=16)
        self.encoder = nn.Sequential(
            Linear(64,32),
            Linear(32, 16)
        )

        self.decoder = nn.Sequential(
            Linear(16, 32),
            Linear(32, 64)
        )

    def forward(self,x):
        y = self.encoder(x)
        y_hat, y_likelihoods = self.entropy_bottleneck(y)
        x_hat = self.decoder(y_hat)

        return x_hat, y_likelihoods

# mse loss
mloss = MSELoss()

# Data
torch.manual_seed(10)
data = torch.rand(10,64).float().cuda()

# Model
model = Network().cuda()
# optimizer
parameters = set(p for n, p in model.named_parameters() if not n.endswith(".quantiles"))
aux_parameters = set(p for n, p in model.named_parameters() if n.endswith(".quantiles"))
optimizer = optim.Adam(parameters, lr=1e-4)
aux_optimizer = optim.Adam(aux_parameters, lr=1e-3)

# train
for i in range(1, 10001):
    optimizer.zero_grad()
    aux_optimizer.zero_grad()

    x_hat, y_likelihoods = model(data)

    mse_loss = mloss(x_hat, data)
    B, C = data.size()
    bpp_loss = torch.log(y_likelihoods).sum() / (-math.log(2) * B )
    distortion_loss = mse_loss + 1e-2 * bpp_loss
    distortion_loss.backward()
    optimizer.step()

    aux_loss = model.aux_loss()
    aux_loss.backward()
    aux_optimizer.step()

    if i %1000 == 0:
        print("iteration:", i)
        print("distortion loss:", distortion_loss.item())
        print('bpp_loss:',bpp_loss.item())
        print('mse_loss:',mse_loss.item())
        print('aux_loss:',aux_loss.item())

torch.save(model.state_dict(),'./model.pth')

# # load
model2 = Network().cuda()
ckpt = torch.load('./model.pth')
model2.load_state_dict(ckpt)
model2.update()

# theoretical bpp
_, y_likelihoods = model2.forward(data)
print('theoretical bpp:', (torch.log(y_likelihoods).sum() / (-math.log(2) * 10 )).item())

# actual bpp
x = model2.encoder(data)
string = model2.entropy_bottleneck.compress(x)
bpp = sum(len(s) for s in string) * 8.0 / 10
print('actual bpp:', bpp)

And here the theoretical bpp is 4.49, but the actual bpp is 64.
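
For reference, the two quantities being compared can be written out on their own. This sketch uses made-up likelihoods and a made-up byte string, and the function names are hypothetical. It shows how each number is defined and does not diagnose the gap reported above:

```python
import math
from typing import Iterable, Sequence


def theoretical_bits(likelihoods: Iterable[float]) -> float:
    """Sum of -log2(p) over all coded symbols."""
    return -sum(math.log2(p) for p in likelihoods)


def actual_bits(strings: Sequence[bytes]) -> int:
    """Size of the emitted byte strings, in bits."""
    return 8 * sum(len(s) for s in strings)


# Made-up example: three symbols with likelihoods 1/2, 1/4 and 1/8,
# coded into a single one-byte string.
likelihoods = [0.5, 0.25, 0.125]
strings = [b"\x01"]
print(theoretical_bits(likelihoods), actual_bits(strings))
```

Because each string occupies a whole number of bytes, the measured size can exceed the estimate by a few bits per string even when the coder is working as intended. A much larger gap, like the one reported here, needs another explanation.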

question about _likelihood

I am a beginner in image compression. I don't understand why lower, upper, and sign are calculated in this step.
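
One plausible reading, offered as an assumption rather than an authoritative answer: lower and upper are the logits of the cumulative distribution at the two edges of the quantization bin, the likelihood is the difference of the two CDF values, and sign moves both logits into the tail where that difference is numerically safe. The sketch below uses a standard logistic CDF as a stand-in for the learned one; all function names are hypothetical:

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def naive_likelihood(x: float) -> float:
    # Direct difference of CDF values at the bin edges x - 0.5 and x + 0.5.
    return sigmoid(x + 0.5) - sigmoid(x - 0.5)


def stable_likelihood(x: float) -> float:
    # Stand-in CDF: standard logistic, so the logit at an edge is the edge itself.
    lower = x - 0.5
    upper = x + 0.5
    # When the bin sits in the upper tail, mirror both logits to the lower
    # tail: the difference is the same by symmetry, but it is computed from
    # tiny numbers instead of numbers that both round to 1.0.
    sign = -1.0 if (lower + upper) > 0 else 1.0
    return abs(sigmoid(sign * upper) - sigmoid(sign * lower))


print(naive_likelihood(1.0), stable_likelihood(1.0))
print(naive_likelihood(40.0), stable_likelihood(40.0))
```

For moderate inputs both versions agree. Far out in the upper tail the naive difference collapses to zero in double precision, while the mirrored version still returns a small positive probability.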

BMSHJ2018 Parameters

I was wondering whether the training parameters for both 'bmshj2018_factorized' and 'bmshj2018_hyperprior' are anywhere in the documentation, e.g. the dataset, optimisation, learning rates, training epochs, etc.

I am assuming that the quality parameter loads different pre-trained networks depending on the lambda parameter they were trained with in the rate-distortion loss. It would be good to know which lambda corresponds to each quality parameter.
Thanks!

How to evaluate the example model with CompressAI utils?

Thanks for the great work!
After I train the example network like this,
python3 examples/train.py -d /path/to/my/image/dataset/ --epochs 300 -lr 1e-4 --batch-size 16 --cuda --save
I'm trying to evaluate the performance of the model, but I couldn't find any tips on using the command line to evaluate the example model.
The evaluation instructions give
python3 -m compressai.utils.eval_model checkpoint /path/to/images/folder/ -a $ARCH -p $MODEL_CHECKPOINT...
which requires entering the architecture of the model, and all of the available architectures are the pre-trained ones.
So is there any way to use the command line to call the CompressAI utils to evaluate a new model?

A small question about updating entropybottleneck

Hi, I really appreciate your work, but I have a small question about updating the entropy bottleneck. I insert an entropy bottleneck in my network and save the model after training without calling update. When I do inference, I load the checkpoint first and then call the update function of the entropy bottleneck. Should this work OK? Or should I update the entropy bottleneck before saving, and load the checkpoint without calling update at inference? Thanks a lot.


1. Support DistributedDataParallel and DataParallel

2. Publish Python package

Looks like working well without CUDA

metric = mse

metric = ms-ssim

PSNR and MS-SSIM are both NaN when using CUDA

metric = mse

metric = ms-ssim

