jmliu206 / lic_tcm

License: MIT License

Python 100.00%

lic_tcm's Issues

Pretrained models

Hi, my research focuses on end-to-end video coding. I would really appreciate it if you could provide pretrained models for all bit-rate settings. Thanks!

I am trying to run the training script on Colab. I used
!pip install torch torchvision torchaudio compressai==1.2.0 einops timm pillow==10.0.0
to set up the environment, then downloaded the Kodak dataset and rearranged it into CompressAI's folder format.

When I try to run
!CUDA_VISIBLE_DEVICES='0' python train.py -d data/ --cuda --N 128 --lambda 0.05 --epochs 50 --num-workers 1 --lr_epoch 45 48 --save_path ./pretrained --save

I get the following output:
2023-08-03 15:23:43.125043: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-08-03 15:23:44.183693: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
model : bmshj2018-factorized
dataset : data/
epochs : 50
learning_rate : 0.0001
num_workers : 1
lmbda : 0.05
batch_size : 8
test_batch_size : 8
aux_learning_rate : 0.001
patch_size : (256, 256)
cuda : True
save : True
seed : 100
clip_max_norm : 1.0
checkpoint : None
type : mse
save_path : ./pretrained
skip_epoch : 0
N : 128
lr_epoch : [45, 48]
continue_train : True
cuda
milestones: [45, 48]
Learning rate: 0.0001
Traceback (most recent call last):
  File "/content/LIC_TCM/train.py", line 426, in <module>
    main(sys.argv[1:])
  File "/content/LIC_TCM/train.py", line 391, in main
    train_one_epoch(
  File "/content/LIC_TCM/train.py", line 121, in train_one_epoch
    for i, d in enumerate(train_dataloader):
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 633, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1345, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1371, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.10/dist-packages/torch/_utils.py", line 644, in reraise
    raise exception
PIL.UnidentifiedImageError: Caught UnidentifiedImageError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.10/dist-packages/compressai/datasets/image.py", line 75, in __getitem__
    img = Image.open(self.samples[index]).convert("RGB")
  File "/usr/local/lib/python3.10/dist-packages/PIL/Image.py", line 3280, in open
    raise UnidentifiedImageError(msg)
PIL.UnidentifiedImageError: cannot identify image file '/content/LIC_TCM/data/train/img003.png'

I checked the image: it is neither corrupt nor 0 bytes. Could you please give me some pointers on what might be causing this issue?

Thanks in advance.
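
One way to narrow down an error like the one above is to check whether each file's first bytes match its extension. The sketch below is not part of the repository; the function names and the small signature table are assumptions made for illustration. A mismatch (for example an HTML page or a pointer file saved under a .png name) is only a hint, since PIL identifies images by content and may still open a mislabeled but valid image.

```python
from pathlib import Path

# Leading "magic" bytes of the two formats checked by this sketch.
SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
}


def sniff_image_type(path):
    """Return 'png', 'jpeg', or None based on the first bytes of the file."""
    with open(path, "rb") as f:
        head = f.read(8)
    for name, magic in SIGNATURES.items():
        if head.startswith(magic):
            return name
    return None


def find_suspect_files(folder):
    """List file names whose extension disagrees with their actual content."""
    suspects = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        ext = path.suffix.lower().lstrip(".")
        ext = "jpeg" if ext == "jpg" else ext
        # Only files with a known image extension are checked.
        if ext in SIGNATURES and sniff_image_type(path) != ext:
            suspects.append(path.name)
    return suspects
```

Calling find_suspect_files("data/train") would list the files worth opening by hand first.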

For MS-SSIM result

Test result

Hello, I was wondering if you could give the PSNR and BPP values of your method on common test sets (e.g. Kodak, Tecnick, CLIC) as I would like to compare with your method.

SSIM RD_data

Hello, could you provide the RD data for SSIM on the Kodak dataset?

Question on the parameter count presented in Table 1

Hello @jmliu206,

thank you very much for providing your interesting work.

Could you please explain in more detail how exactly you calculate the parameter number given in Table 1? According to your paper, your small model should have a parameter count of 44.96M, while I get about 76M when testing your code. I have created a colab to reproduce this result:

https://colab.research.google.com/drive/1KdwoC1i-TYMtc3akyuX83exipynKEE4v?usp=sharing

I have used the default setting with C=128 - probably I am just missing some details here...

I was also a bit surprised by the reported number of model parameters for SwinT-ChARM. According to Zhu et al., they have a total of 32.6M (Table 3), whereas you report 60.55M.

It would be great if you could provide further insights here.

Thanks in advance,
Nikolai

How to save compressed images

How can I save the compressed images produced by this model, and how much storage does a compressed image need? If I save it with something like np.savez or h5py, will the file be smaller than a JPG or BMP?
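
For a learned codec, the size that matters is the length of the entropy-coded byte strings, not a float tensor dumped with np.savez or h5py. The sketch below assumes the model's compress() returns a latent shape and a list of byte strings, as CompressAI-style models do; the file layout and the function names are made up for this example and are not the repository's format.

```python
import struct


def write_bitstream(path, shape, strings):
    """Write a latent shape and a list of byte strings to one binary file.

    Layout (an assumption of this sketch): two uint16 values for the shape,
    one uint8 for the number of strings, then each string prefixed with
    its length as a uint32.
    """
    with open(path, "wb") as f:
        f.write(struct.pack("<HHB", shape[0], shape[1], len(strings)))
        for s in strings:
            f.write(struct.pack("<I", len(s)))
            f.write(s)


def read_bitstream(path):
    """Inverse of write_bitstream: return (shape, strings)."""
    with open(path, "rb") as f:
        h, w, count = struct.unpack("<HHB", f.read(5))
        strings = []
        for _ in range(count):
            (length,) = struct.unpack("<I", f.read(4))
            strings.append(f.read(length))
    return (h, w), strings
```

The resulting file size in bytes, times 8 and divided by the number of image pixels, gives the actual bits per pixel including the small header.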

LIC-TCM now available in TensorFlow2

Dear @jmliu206,

I am happy to share that LIC-TCM is now also available in TensorFlow 2:
https://github.com/Nikolai10/LIC-TCM

It would be great if you could include the TensorFlow implementation in your README :) Thanks!

Some questions about the calculations of bpp

Hi, I have a question about the calculation of bpp.
In your code, only the y_strings are used to calculate the bpp; the z_strings are not counted. I think the z_strings should be included as well.
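
To make the point concrete, here is a minimal sketch of a bpp calculation that counts every byte string passed to it. It assumes a single image and a structure like [y_strings, z_strings], where each inner list holds one bytes object per image; the function name is hypothetical and this is not the repository's evaluation code.

```python
def bits_per_pixel(strings, height, width):
    """Total bits in all byte strings divided by the number of pixels.

    `strings` is a list of lists of bytes, e.g. [y_strings, z_strings].
    """
    total_bytes = sum(len(s) for group in strings for s in group)
    return total_bytes * 8 / (height * width)


# Illustrative sizes only: 12000 bytes for y and 300 bytes for z
# on a 768x512 image.
y_strings = [bytes(12000)]
z_strings = [bytes(300)]
bpp_y_only = bits_per_pixel([y_strings], 512, 768)
bpp_y_and_z = bits_per_pixel([y_strings, z_strings], 512, 768)
```

With these made-up sizes, leaving out z understates the rate by 2400 bits, which is why the two values differ.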

CompressAI version

Hi, thanks for your wonderful work!

Could you please tell me which version of CompressAI you used? I have been trying the latest version, 1.2.4, but I get the following error: "TypeError: TCM.__init__() got an unexpected keyword argument 'entropy_bottleneck_channels'".

I suspect that this issue might be related to a version inconsistency with CompressAI.

Could you please provide guidance on which version of CompressAI is compatible with my setup? Your help would be greatly appreciated.

About ERF

Hi, sorry to bother you. I'd like to ask how the effective receptive field (ERF) in Figure 3 of the paper is visualized, and which layer is used as the target layer. Thanks~🧐

Configuration of experimental environment

Thank you very much for your excellent work. Could you provide a detailed configuration of your environment? I would like to build on your work.

The Tcm model you proposed can be used in image segmentation

MS-SSIM optimized RD point

Hi, could you please provide detailed data on Kodak when optimized with ms-ssim loss function?

Cannot run

Here is the error,

Traceback (most recent call last):
  File "./train.py", line 427, in <module>
    main(sys.argv[1:])
  File "./train.py", line 364, in main
    net = TCM(config=[2,2,2,2,2,2], head_dim=[8, 16, 32, 32, 16, 8], drop_path_rate=0.0, N=args.N, M=320)
  File "/home/rong/LIC-TCM/models/tcm.py", line 312, in __init__
    super().__init__(entropy_bottleneck_channels=N)
  File "/home/rong/anaconda3/envs/LIC-TCM/lib/python3.8/site-packages/torch/nn/modules/module.py", line 445, in __init__
    raise TypeError("{}.__init__() got an unexpected keyword argument '{}'"
TypeError: TCM.__init__() got an unexpected keyword argument 'entropy_bottleneck_channels'

and my command,

CUDA_VISIBLE_DEVICES='0' python -u ./train.py -d ./data/ --cuda --N 128 --lambda 0.05 --epochs 50 --lr_epoch 45 48 --save_path ./checkpoint/ --save --checkpoint ./checkpoint/

I don't think I am passing an argument called entropy_bottleneck_channels anywhere. What should I do?
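
The traceback shows the keyword being passed inside models/tcm.py to the CompressAI base class, so the installed CompressAI version is the first thing to check. Another report on this page installed compressai==1.2.0 and got past model construction, which suggests, but does not confirm, that 1.2.0 accepts this argument. The helper below is a small sketch (the function name is an assumption) for printing what is actually installed.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Packages mentioned in the install command quoted in another issue.
    for name in ("compressai", "torch", "timm", "einops"):
        print(name, installed_version(name))
```

If the printed CompressAI version is newer than 1.2.0, pinning it with pip install compressai==1.2.0 would be a reasonable first experiment.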

The detailed result

Could you give the detailed results on the Kodak, Tecnick and CLIC datasets? Ideally the results would also cover the small, middle and big models.

About the calculation of the bpp in the loss function.

More pre-trained models

Excuse me, could you provide more pre-trained models? I need to run forward inference again when testing other datasets. Thank you very much and I look forward to your reply.

Regarding the method comparison,

Hello, author. You randomly selected 300,000 images from the ImageNet dataset as your training set. Are the compared methods also retrained on the same set of 300,000 images you selected?

Inability to Extract Pretrained Model

I recently downloaded the pretrained model provided in the Google Drive link included in the project's README. The downloaded file is a .tar file with a size of 920381364 bytes. Unfortunately, I am encountering an issue when attempting to extract the files using the tar command. The specific error message that appears is "This does not look like a tar archive", indicating a possible file corruption or damage.

To help in troubleshooting this issue, could you kindly provide a MD5 checksum for the original file? This would allow us to verify the integrity of the downloaded file and ascertain whether it is indeed damaged or not.

Additionally, it would be beneficial if an alternative download method could be provided, apart from Google Drive. This may help circumvent any potential issues with the current download process.
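
Until a reference checksum is available, the download can at least be characterized locally. The sketch below computes an MD5 digest and reports whether the file is a tar or zip container. One assumption worth testing: checkpoints written by torch.save are often named .pth.tar without being tar archives, and recent PyTorch versions write a zip-based container, in which case the tar error would be expected and the file would be loaded with torch.load rather than extracted. Whether that applies to this particular file is not confirmed by anything on this page.

```python
import hashlib
import tarfile
import zipfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe_container(path):
    """Return 'tar', 'zip', or 'unknown' depending on the file's format."""
    if tarfile.is_tarfile(path):
        return "tar"
    if zipfile.is_zipfile(path):
        return "zip"
    return "unknown"
```

Two people comparing md5_of_file output for their downloads can tell whether they hold the same bytes, even without an official checksum.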

The problem of training

During training, the log shows an MSE loss of 0.001 or 0.000 from epoch 0 onward, already at step 16000/183885 (9%). Is that right?

My training set has 180k+ images and my test set has 50+.

Is this behavior reasonable?

Thanks a lot!
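
A quick way to judge such values is to convert them to PSNR. The sketch below assumes the logged number is a per-pixel MSE on images scaled to [0, 1]; that is an assumption about the training script, not something stated in this thread.

```python
import math


def psnr_from_mse(mse, max_value=1.0):
    """PSNR in dB for a given mean squared error and peak signal value."""
    if mse <= 0:
        return float("inf")
    return 10.0 * math.log10(max_value ** 2 / mse)


# Values as they might appear in a log printed with three decimals.
psnr_at_0_001 = psnr_from_mse(0.001)
psnr_at_0_0004 = psnr_from_mse(0.0004)
```

Under that assumption, 0.001 corresponds to roughly 30 dB, and anything that rounds to 0.000 is above about 33 dB, so printing the loss with more decimals (or logging PSNR directly) is more informative than the rounded value.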

z_likehood problem

Hello author, when I train the model with N=128 and lambda=0.05, after just one epoch almost all the values in z_likehood become 1.00000. Is this normal? Have you seen this before?
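
For context on what such values mean: in likelihood-based rate estimation, the cost of a symbol is -log2 of its likelihood, so likelihoods near 1 mean the hyper-latent z is estimated to cost almost no bits. The sketch below only illustrates that arithmetic; the function name is an assumption, and it does not settle whether this behavior is expected for this model after one epoch.

```python
import math


def bits_from_likelihoods(likelihoods):
    """Estimated code length in bits: the sum of -log2(p) over all symbols."""
    return sum(-math.log2(p) for p in likelihoods)


# Likelihoods of exactly 1 contribute nothing to the rate.
bits_all_ones = bits_from_likelihoods([1.0] * 100)
# Less certain symbols cost more: -log2(0.5) + -log2(0.25) = 1 + 2 bits.
bits_mixed = bits_from_likelihoods([0.5, 0.25])
```

Comparing this quantity for z against the same quantity for y shows how much of the total rate each part carries.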
