jmliu206 / lic_tcm
License: MIT License
Hi, my research focuses on end-to-end video coding. I would really appreciate it if you could provide pretrained models for all bit-rate settings. Thanks!
Thanks for sharing your implementation.
I am trying to run the training script on colab. I used
!pip install torch torchvision torchaudio compressai==1.2.0 einops timm pillow==10.0.0
to set up the environment and then downloaded and rearranged the Kodak dataset as per compressAI's format.
When I try to run
!CUDA_VISIBLE_DEVICES='0' python train.py -d data/ --cuda --N 128 --lambda 0.05 --epochs 50 --num-workers 1 --lr_epoch 45 48 --save_path ./pretrained --save
I get the following output:
2023-08-03 15:23:43.125043: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-08-03 15:23:44.183693: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
model : bmshj2018-factorized
dataset : data/
epochs : 50
learning_rate : 0.0001
num_workers : 1
lmbda : 0.05
batch_size : 8
test_batch_size : 8
aux_learning_rate : 0.001
patch_size : (256, 256)
cuda : True
save : True
seed : 100
clip_max_norm : 1.0
checkpoint : None
type : mse
save_path : ./pretrained
skip_epoch : 0
N : 128
lr_epoch : [45, 48]
continue_train : True
cuda
milestones: [45, 48]
Learning rate: 0.0001
Traceback (most recent call last):
File "/content/LIC_TCM/train.py", line 426, in <module>
main(sys.argv[1:])
File "/content/LIC_TCM/train.py", line 391, in main
train_one_epoch(
File "/content/LIC_TCM/train.py", line 121, in train_one_epoch
for i, d in enumerate(train_dataloader):
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 633, in __next__
data = self._next_data()
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1345, in _next_data
return self._process_data(data)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 1371, in _process_data
data.reraise()
File "/usr/local/lib/python3.10/dist-packages/torch/_utils.py", line 644, in reraise
raise exception
PIL.UnidentifiedImageError: Caught UnidentifiedImageError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/usr/local/lib/python3.10/dist-packages/compressai/datasets/image.py", line 75, in __getitem__
img = Image.open(self.samples[index]).convert("RGB")
File "/usr/local/lib/python3.10/dist-packages/PIL/Image.py", line 3280, in open
raise UnidentifiedImageError(msg)
PIL.UnidentifiedImageError: cannot identify image file '/content/LIC_TCM/data/train/img003.png'
I checked the image; it is neither corrupt nor 0 bytes. Could you suggest what might be causing this issue?
Thanks in advance.
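One quick thing to rule out for the error above is a file that has a .png name but does not contain image data (for example, a saved web page). The following is a minimal sketch using only the standard library; the helper names `sniff_format` and `find_suspect_files` are hypothetical, and the check only looks at the first bytes, so it cannot prove that a file decodes correctly.

```python
from pathlib import Path

# Leading bytes ("magic numbers") of two common image formats.
SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
}


def sniff_format(path):
    """Return 'png', 'jpeg', or None based on the file's first bytes."""
    with open(path, "rb") as f:
        head = f.read(8)
    for name, magic in SIGNATURES.items():
        if head.startswith(magic):
            return name
    return None


def find_suspect_files(folder):
    """List files under `folder` that do not start with a known image signature."""
    return sorted(
        p for p in Path(folder).rglob("*")
        if p.is_file() and sniff_format(p) is None
    )


# Example (path taken from the report above):
# for p in find_suspect_files("data/train"):
#     print("not a PNG/JPEG by signature:", p)
```

If a file is flagged, re-downloading or re-exporting that image is a reasonable next step; if nothing is flagged, the cause lies elsewhere.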
Hello, I was wondering if you could give the PSNR and BPP values of your method on common test sets (e.g. Kodak, Tecnick, CLIC) as I would like to compare with your method.
Hello, could you provide RD_data about ssim on Kodak dataset?
Hello @jmliu206,
thank you very much for providing your interesting work.
Could you please explain in more detail how you compute the parameter counts given in Table 1? According to your paper, your small model should have 44.96M parameters, while I get about 76M when testing your code. I have created a Colab notebook to reproduce this result:
https://colab.research.google.com/drive/1KdwoC1i-TYMtc3akyuX83exipynKEE4v?usp=sharing
I have used the default setting with C=128 - probably I am just missing some details here...
I was also a bit surprised by the reported number of model parameters for SwinT-ChARM. According to Zhu et al., they have a total of 32.6M (Table 3), whereas you report 60.55M.
It would be great if you could provide further insights here.
Thanks in advance,
Nikolai
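For reference, a common way to count parameters of a PyTorch module is sketched below. Whether Table 1 was produced this way is not stated in the thread; the `trainable_only` switch is a hypothetical illustration of one convention that can change the total.

```python
def count_parameters(model, trainable_only=False):
    """Sum the element counts of a module's parameters.

    `model` is expected to behave like torch.nn.Module: `parameters()`
    yields objects with `numel()` and `requires_grad`.
    """
    return sum(
        p.numel()
        for p in model.parameters()
        if p.requires_grad or not trainable_only
    )


# Example with a PyTorch model (assumes torch is installed):
# import torch.nn as nn
# print(count_parameters(nn.Linear(10, 5)) / 1e6, "M parameters")
```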
How should compressed images be saved, and how much storage does a compressed image need? With this model, would files written with np.savez or h5py be smaller than JPG or BMP?
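As a rough illustration of the storage question, the sketch below writes entropy-coded byte strings to a small binary file. It assumes the model's `compress()` output can be flattened to a list of `bytes` plus a two-integer shape, as in CompressAI-style models; the file layout and the function names are hypothetical, not the repository's format.

```python
import struct


def write_bitstream(path, shape, strings):
    """Write a 5-byte header (two shape values, string count) and the strings.

    Each string is stored as a 4-byte big-endian length followed by its bytes.
    """
    with open(path, "wb") as f:
        f.write(struct.pack(">HHB", shape[0], shape[1], len(strings)))
        for s in strings:
            f.write(struct.pack(">I", len(s)))
            f.write(s)


def read_bitstream(path):
    """Inverse of write_bitstream: return (shape, list_of_byte_strings)."""
    with open(path, "rb") as f:
        h, w, n = struct.unpack(">HHB", f.read(5))
        strings = []
        for _ in range(n):
            (length,) = struct.unpack(">I", f.read(4))
            strings.append(f.read(length))
    return (h, w), strings
```

With this layout the file size is the total length of the byte strings plus 5 bytes of header and 4 bytes per string, so the size on disk follows the coded bitstream rather than the size of the latent tensors.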
Dear @jmliu206,
I am happy to share that LIC-TCM is now also available in TensorFlow 2:
https://github.com/Nikolai10/LIC-TCM
It would be great if you could include the TensorFlow implementation in your README :) Thanks!
Hi, I have a question about how bpp is calculated.
In your code, you use only the y_strings to calculate bpp and leave out the z_strings. I think it is important to include the z_strings as well.
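The point can be made concrete with a small sketch that counts the bytes of every string group. It assumes the strings are grouped as lists of `bytes` (one list for y, one for z); the function name and the example sizes are made up for illustration.

```python
def bpp_from_strings(string_groups, height, width):
    """Bits per pixel from all coded strings (for example [y_strings, z_strings])."""
    total_bytes = sum(len(s) for group in string_groups for s in group)
    return total_bytes * 8 / (height * width)


# Hypothetical sizes for a 512x768 image:
y_strings = [b"\x00" * 3000]
z_strings = [b"\x00" * 200]
bpp_y_only = bpp_from_strings([y_strings], 512, 768)
bpp_total = bpp_from_strings([y_strings, z_strings], 512, 768)
print(f"y only: {bpp_y_only:.4f} bpp, y + z: {bpp_total:.4f} bpp")
```

The difference between the two numbers is the side-information cost that is missed when the z strings are left out.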
Hi, thanks for your wonderful works!
Could you please tell me which version of CompressAI you used? I have been trying the latest version, 1.2.4, but I get the following error: "TypeError: TCM.__init__() got an unexpected keyword argument 'entropy_bottleneck_channels'".
I suspect that this issue might be related to a version inconsistency with CompressAI.
Could you please provide guidance on which version of CompressAI is compatible with my setup? Your help would be greatly appreciated.
Thank you very much for your excellent work. Could you provide a detailed configuration of your environment? I would like to follow your work.
Can the TCM model you proposed be used for image segmentation?
Hi, could you please provide detailed data on Kodak when optimized with ms-ssim loss function?
Here is the error,
Traceback (most recent call last):
File "./train.py", line 427, in <module>
main(sys.argv[1:])
File "./train.py", line 364, in main
net = TCM(config=[2,2,2,2,2,2], head_dim=[8, 16, 32, 32, 16, 8], drop_path_rate=0.0, N=args.N, M=320)
File "/home/rong/LIC-TCM/models/tcm.py", line 312, in __init__
super().__init__(entropy_bottleneck_channels=N)
File "/home/rong/anaconda3/envs/LIC-TCM/lib/python3.8/site-packages/torch/nn/modules/module.py", line 445, in __init__
raise TypeError("{}.__init__() got an unexpected keyword argument '{}'"
TypeError: TCM.__init__() got an unexpected keyword argument 'entropy_bottleneck_channels'
and my command,
CUDA_VISIBLE_DEVICES='0' python -u ./train.py -d ./data/ --cuda --N 128 --lambda 0.05 --epochs 50 --lr_epoch 45 48 --save_path ./checkpoint/ --save --checkpoint ./checkpoint/
I don't think there is an argument called entropy_bottleneck_channels. What should I do?
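One way to check whether an installed base class declares this keyword is to inspect its constructor signature. The sketch below is generic; the helper name is hypothetical, and the commented import path for CompressAI is an assumption that may differ between versions. Note that another report above pins compressai==1.2.0 and gets past model construction.

```python
import inspect


def explicit_init_params(cls):
    """Names explicitly declared by cls.__init__, ignoring self, *args and **kwargs."""
    sig = inspect.signature(cls.__init__)
    return [
        name
        for name, p in sig.parameters.items()
        if name != "self"
        and p.kind not in (p.VAR_POSITIONAL, p.VAR_KEYWORD)
    ]


# Example (assumes compressai is installed and exposes CompressionModel here):
# from importlib.metadata import version
# from compressai.models import CompressionModel
# print(version("compressai"), explicit_init_params(CompressionModel))
```

If `entropy_bottleneck_channels` is missing from the printed list, the installed version does not declare that argument, and installing a version that does is one possible way forward.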
Could you give the detailed results on the Kodak, Tecnick and CLIC datasets, including the small, middle and big models?
Excuse me, could you provide more pre-trained models? I need to run forward inference again when testing other datasets. Thank you very much and I look forward to your reply.
Hello, author. You randomly selected 300,000 images from the ImageNet dataset as your training set. Are the compared methods also retrained on the same set of 300,000 images you selected?
I recently downloaded the pretrained model provided in the Google Drive link included in the project's README. The downloaded file is a .tar file with a size of 920381364 bytes. Unfortunately, I am encountering an issue when attempting to extract the files using the tar command. The specific error message that appears is "This does not look like a tar archive", indicating a possible file corruption or damage.
To help in troubleshooting this issue, could you kindly provide a MD5 checksum for the original file? This would allow us to verify the integrity of the downloaded file and ascertain whether it is indeed damaged or not.
Additionally, it would be beneficial if an alternative download method could be provided, apart from Google Drive. This may help circumvent any potential issues with the current download process.
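Until a reference checksum is available, the local file can still be fingerprinted and its container type checked. The sketch below uses only the standard library; the function names and the example filename are hypothetical. One assumption worth testing: PyTorch checkpoints are sometimes saved with a .tar suffix without being tar archives, and recent `torch.save` output is a zip-based container, so a "zip" result would point to a checkpoint that should be loaded directly rather than extracted.

```python
import hashlib
import tarfile
import zipfile


def md5sum(path, chunk_size=1 << 20):
    """MD5 hex digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def container_type(path):
    """Return 'tar', 'zip', or 'unknown' for the file at `path`."""
    if tarfile.is_tarfile(path):
        return "tar"
    if zipfile.is_zipfile(path):
        return "zip"
    return "unknown"


# Example (hypothetical filename):
# print(md5sum("model.pth.tar"), container_type("model.pth.tar"))
```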
Is an MSE loss of 0.001 or 0.000 expected as early as epoch 0, at 16000/183885 (9%)?
My training set has 180k+ images and my test set has 50+.
With lambda set to 0.05:
Test epoch 0: Average losses: Loss: 2.970 | MSE loss: 0.001 | Bpp loss: 0.79 | Aux loss: 37.53
Test epoch 1: Average losses: Loss: 2.344 | MSE loss: 0.000 | Bpp loss: 0.74 | Aux loss: 8.57
Is this reasonable?
Thanks a lot!
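The three-decimal printout hides most of the MSE value, but it can be recovered from the other logged numbers. The sketch below assumes the total loss has the form lambda * 255^2 * MSE + bpp with pixel values in [0, 1], as in CompressAI's example training script; this form is not confirmed for this repository, and the function names are hypothetical.

```python
import math


def implied_mse(total_loss, bpp_loss, lmbda):
    """Solve loss = lmbda * 255**2 * mse + bpp for mse (assumed loss form)."""
    return (total_loss - bpp_loss) / (lmbda * 255 ** 2)


def psnr_db(mse):
    """PSNR in dB for images scaled to [0, 1]."""
    return 10 * math.log10(1.0 / mse)


# Values from the two test epochs in the log above.
for epoch, (loss, bpp) in enumerate([(2.970, 0.79), (2.344, 0.74)]):
    mse = implied_mse(loss, bpp, lmbda=0.05)
    print(f"epoch {epoch}: mse = {mse:.6f}, PSNR = {psnr_db(mse):.2f} dB")
```

Under that assumption the implied MSE is roughly 6.7e-4 in epoch 0 and 4.9e-4 in epoch 1 (about 31.7 dB and 33.1 dB), which rounds to the 0.001 and 0.000 shown in the log.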
Hello author, when I train the model with N=128 and lambda=0.05, after just one epoch almost all values in z_likelihoods are 1.00000. Is this normal? Have you seen this before?
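To see what such values mean for the rate, recall that the estimated cost of a symbol is -log2 of its likelihood. The sketch below is a plain-Python illustration with a hypothetical function name; whether near-1 likelihoods are expected at this stage of training is for the authors to confirm.

```python
import math


def bits_from_likelihoods(likelihoods):
    """Estimated bits for a sequence of symbol likelihoods: sum of -log2(p)."""
    return sum(-math.log2(p) for p in likelihoods)


print(bits_from_likelihoods([1.0] * 100))  # likelihoods of exactly 1 cost no bits
print(bits_from_likelihoods([0.5] * 8))    # each likelihood of 0.5 costs one bit
```

Likelihoods close to 1 therefore mean the hyper-latent z is estimated to cost almost no bits, so nearly all of the reported bpp would come from y.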