diffusionad's People

Contributors

huizhang0812

diffusionad's Issues

python train.py Loss not decreasing

Hi, I added the data according to the README, but when I run python train.py it shows:

class screw
args1.json defaultdict(<class 'str'>, {'img_size': [256, 256], 'Batch_Size': 2, 'EPOCHS': 300, 'T': 1000, 'base_channels': 128, 'beta_schedule': 'linear', 'loss_type': 'l2', 'diffusion_lr': 0.0001, 'seg_lr': 1e-05, 'random_slice': True, 'weight_decay': 0.0, 'save_imgs': True, 'save_vids': False, 'dropout': 0, 'attention_resolutions': '32,16,8', 'num_heads': 4, 'num_head_channels': -1, 'noise_fn': 'gauss', 'channels': 3, 'mvtec_root_path': '/content/drive/MyDrive/DiffusionAD/datasets/mvtec', 'visa_root_path': 'datasets/VisA_1class/1cls', 'dagm_root_path': 'datasets/dagm', 'mpdd_root_path': 'datasets/mpdd', 'anomaly_source_path': '/content/drive/MyDrive/DiffusionAD/datasets/dtd', 'noisier_t_range': 600, 'less_t_range': 300, 'condition_w': 1, 'eval_normal_t': 200, 'eval_noisier_t': 400, 'output_path': 'outputs', 'arg_num': '1'})
/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py:557: UserWarning: This DataLoader will create 8 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py:557: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
Epoch:0, Train loss: nan: 1% 1/160 [00:04<12:14, 4.62s/it]thresh_path /content/drive/MyDrive/DiffusionAD/datasets/mvtec/screw/DISthresh/good/309.png
image_path /content/drive/MyDrive/DiffusionAD/datasets/mvtec/screw/train/good/309.png
Epoch:0, Train loss: nan: 1% 2/160 [00:06<08:03, 3.06s/it]thresh_path /content/drive/MyDrive/DiffusionAD/datasets/mvtec/screw/DISthresh/good/151.png
image_path /content/drive/MyDrive/DiffusionAD/datasets/mvtec/screw/train/good/151.png
thresh_path /content/drive/MyDrive/DiffusionAD/datasets/mvtec/screw/DISthresh/good/023.png
image_path /content/drive/MyDrive/DiffusionAD/datasets/mvtec/screw/train/good/023.png
thresh_path /content/drive/MyDrive/DiffusionAD/datasets/mvtec/screw/DISthresh/good/180.png
image_path /content/drive/MyDrive/DiffusionAD/datasets/mvtec/screw/train/good/180.png
Epoch:0, Train loss: nan: 2% 3/160 [00:08<06:13, 2.38s/it]thresh_path /content/drive/MyDrive/DiffusionAD/datasets/mvtec/screw/DISthresh/good/015.png
image_path /content/drive/MyDrive/DiffusionAD/datasets/mvtec/screw/train/good/015.png
thresh_path /content/drive/MyDrive/DiffusionAD/datasets/mvtec/screw/DISthresh/good/292.png
image_path /content/drive/MyDrive/DiffusionAD/datasets/mvtec/screw/train/good/292.png
Epoch:0, Train loss: nan: 2% 4/160 [00:09<05:21, 2.06s/it]thresh_path /content/drive/MyDrive/DiffusionAD/datasets/mvtec/screw/DISthresh/good/113.png
image_path /content/drive/MyDrive/DiffusionAD/datasets/mvtec/screw/train/good/113.png
thresh_path /content/drive/MyDrive/DiffusionAD/datasets/mvtec/screw/DISthresh/good/152.png
image_path /content/drive/MyDrive/DiffusionAD/datasets/mvtec/screw/train/good/152.png

I printed image_path and thresh_path and the paths are correct, so why doesn't the loss decrease?
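The UserWarning in the log above says the DataLoader requests 8 and 4 worker processes while the system suggests at most 2. A minimal sketch of capping the worker count follows. `pick_num_workers` is a hypothetical helper that is not part of train.py, and `os.cpu_count()` only approximates the limit that PyTorch reports.

```python
import os


def pick_num_workers(requested: int) -> int:
    """Cap the requested DataLoader worker count at the number of CPU cores."""
    available = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(0, min(requested, available))


if __name__ == "__main__":
    for requested in (8, 4):
        print(requested, "->", pick_num_workers(requested))
```

The returned value would be passed as `num_workers=` when the DataLoader is built. The warning concerns loading speed, so on its own it probably does not explain a nan loss.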

When I run train.py, train_loss becomes nan. Is this problem normal?

class carpet
args1.json defaultdict(<class 'str'>, {'img_size': [256, 256], 'Batch_Size': 4, 'EPOCHS': 3000, 'T': 1000, 'base_channels': 128, 'beta_schedule': 'linear', 'loss_type': 'l2', 'diffusion_lr': 0.0001, 'seg_lr': 1e-05, 'random_slice': True, 'weight_decay': 0.0, 'save_imgs': True, 'save_vids': False, 'dropout': 0, 'attention_resolutions': '32,16,8', 'num_heads': 4, 'num_head_channels': -1, 'noise_fn': 'gauss', 'channels': 3, 'mvtec_root_path': 'datasets/mvtec', 'visa_root_path': 'datasets/VisA/visa', 'dagm_root_path': 'datasets/dagm', 'mpdd_root_path': 'datasets/mpdd', 'anomaly_source_path': 'datasets/dtd', 'noisier_t_range': 600, 'less_t_range': 300, 'condition_w': 1, 'eval_normal_t': 200, 'eval_noisier_t': 400, 'output_path': 'outputs', 'arg_num': '1'})
Epoch:0, Train loss: nan:

train_loss keeps increasing, for example from 2.03 to 4.23, and then becomes nan, sometimes when it reaches about 10 and sometimes when it reaches about 20.

Is the whole network 2-stage training?

I mean, should I train the diffusion model first and then train the diffusion model and the segmentation network together (2-stage training)?
Or should I just train the diffusion model and the segmentation network together (1-stage training)?

My machine can only handle a batch size of 3. How should I set the other parameters to get good performance? I am using the VisA dataset.

{
"img_size": [256,256],
"Batch_Size": 3,
"EPOCHS": 100,
"T": 1000,
"base_channels": 128,
"beta_schedule": "cosine",
"loss_type": "l2",
"diffusion_lr": 1e-10,
"seg_lr": 1e-5,
"random_slice": true,
"weight_decay": 0.0,
"save_imgs":true,
"save_vids":false,
"dropout":0,
"attention_resolutions":"32,16,8",
"num_heads":8,
"num_head_channels":-1,
"noise_fn":"gauss",
"channels":3,
"visa_root_path":"/home/anywhere3090l/Desktop/henry/DiffusionAD-main/visa",
"anomaly_source_path":"/home/anywhere3090l/Desktop/henry/DiffusionAD-main/dtd",
"noisier_t_range":600,
"less_t_range":300,
"condition_w":5,
"eval_normal_t":200,
"eval_noisier_t":400,
"output_path":"outputs"

}
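Several values in the config above differ from the other args1.json dumps quoted on this page. A small sketch for listing such differences is shown here. Treating the `REFERENCE` values as a baseline is an assumption (they are simply copied from the other dumps), and `diff_config` is a hypothetical helper.

```python
import json

# Values that appear in the other args1.json dumps quoted on this page.
# Using them as a baseline is an assumption, not an author recommendation.
REFERENCE = {
    "beta_schedule": "linear",
    "diffusion_lr": 1e-4,
    "seg_lr": 1e-5,
    "num_heads": 4,
    "condition_w": 1,
}


def diff_config(config: dict, reference: dict = REFERENCE) -> dict:
    """Return {key: (config_value, reference_value)} for every key that differs."""
    return {
        key: (config.get(key), ref_value)
        for key, ref_value in reference.items()
        if config.get(key) != ref_value
    }


if __name__ == "__main__":
    posted = json.loads(
        '{"beta_schedule": "cosine", "diffusion_lr": 1e-10, "seg_lr": 1e-5,'
        ' "num_heads": 8, "condition_w": 5}'
    )
    for key, (got, expected) in diff_config(posted).items():
        print(f"{key}: {got!r} differs from reference {expected!r}")
```

In particular, a `diffusion_lr` of 1e-10 is six orders of magnitude below the 1e-4 used in the other configs, so the denoising network would barely update. That value may be worth double-checking before tuning anything else.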

Error in training Visa dataset

Hello, I downloaded the foregrounds for the VisA dataset from Google Drive, but the number of images does not match the dataset, which causes an error when training on VisA. Could you please upload the file again? Thank you very much.

ValueError

Hello, I got an error at the end of training. The error is shown below.
[Screenshot 2023-12-12 163430]

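Where are the simulated anomalies used?

Anomalies are simulated by combining normal samples with DTD textures.
Are they included in training?

As I understand it, a diffusion model is trained only on normal images,
so given an anomalous input it can only generate a normal image,
and the distance between the two images is then computed.

I don't understand where the simulated anomalies are used.
Thank you.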

The role of thresh

Hello, what is the role of "thresh" in the training set in this project, and what should we pay attention to when making "thresh" samples?

Model size and training time

Hello, when I set the batch size to 4 (with a batch size of 2, training takes 8 days), I get the following error:
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 23.70 GiB total capacity; 17.55 GiB already allocated; 145.56 MiB free; 17.79 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
Is this because the model has too many parameters? Each image in my own dataset is only about 40 KB. Thank you!
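The figures in the out-of-memory message above can be read programmatically to see how much memory is reserved by PyTorch but not allocated to tensors. This is a minimal sketch that assumes the message format quoted in this issue. `parse_oom` is a hypothetical helper, not a PyTorch API.

```python
import re

# Field labels as they appear in the error message quoted above.
_FIELDS = {
    "total": r"([\d.]+) GiB total capacity",
    "allocated": r"([\d.]+) GiB already allocated",
    "reserved": r"([\d.]+) GiB reserved in total",
}


def parse_oom(message: str) -> dict:
    """Extract the GiB figures from a 'CUDA out of memory' message."""
    stats = {}
    for name, pattern in _FIELDS.items():
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"could not find {name!r} in message")
        stats[name] = float(match.group(1))
    # Memory held by the caching allocator but not backing live tensors.
    stats["cached_unused"] = round(stats["reserved"] - stats["allocated"], 2)
    return stats


if __name__ == "__main__":
    msg = ("Tried to allocate 256.00 MiB (GPU 0; 23.70 GiB total capacity; "
           "17.55 GiB already allocated; 145.56 MiB free; "
           "17.79 GiB reserved in total by PyTorch)")
    print(parse_oom(msg))
```

For this message the reserved figure is only about 0.24 GiB above the allocated one, so fragmentation, which is what `max_split_size_mb` targets, does not look like the main factor. The file size of the images on disk is unlikely to matter much either, since activation memory depends on the resized input (`img_size`) and the batch size.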

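This is very promising work. I have a question for you.

The denoising process uses a one-step denoising paradigm. Is the forward noising process that simulates diffusion also a single step? In other words, does each of the two forward diffusion passes during training need only one noising step, with xts obtained by noising directly from t=0 to ts to learn ϵθ(xts), and xtb obtained by noising directly from t=0 to tb to learn ϵθ(xtb)? Is my understanding correct? I look forward to your clarification, thank you!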
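For context on the question above: in standard DDPM the forward process has a closed form, so a sample at any timestep t can be drawn from x_0 in one step as x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε. The sketch below illustrates that property with plain Python lists. The beta range (1e-4 to 0.02) is the commonly used DDPM default and is an assumption here, as are the function names and the timesteps 200 and 600. Whether the repository follows exactly this form is not confirmed on this page.

```python
import math
import random


def linear_betas(T: int, beta_start: float = 1e-4, beta_end: float = 0.02) -> list:
    """Linear beta schedule; start/end are common DDPM defaults (assumed)."""
    return [beta_start + (beta_end - beta_start) * t / (T - 1) for t in range(T)]


def alpha_bar(betas: list) -> list:
    """Cumulative product of (1 - beta_t)."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def q_sample(x0: list, t: int, alpha_bars: list, rng: random.Random) -> list:
    """Draw x_t directly from x_0 without iterating over earlier timesteps."""
    a = alpha_bars[t]
    return [math.sqrt(a) * v + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for v in x0]


if __name__ == "__main__":
    abars = alpha_bar(linear_betas(T=1000))
    rng = random.Random(0)
    x0 = [0.5, -0.25, 1.0]  # a toy "image" with three pixels
    print(q_sample(x0, 200, abars, rng))  # smaller t: lightly noised
    print(q_sample(x0, 600, abars, rng))  # larger t: much noisier
    print(abars[200], abars[600])
```

Because ᾱ_t shrinks as t grows, the sample at the larger timestep keeps less of x_0 and more noise, and neither sample requires simulating the intermediate steps.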

Training is so slow

One epoch takes close to four hours

I'm using an RTX 4080 to train on the VisA dataset, with the batch size set to 8.
Any ideas on how to improve it?

Hello, a question about "img_size"

Hello, Zhanghui, thank you for your open-source code.

I have a question about the parameter of "img_size". How can I change "img_size": [256,256] to "img_size": [1060,1060], or "img_size": [1024,1024] ? Do I need to modify the original model? Can you give me some suggestions?

Error when running python train.py

Hello. I've read your paper. I'm not good at writing code and have little experience with experiments, and I get an error when running python train.py.

I downloaded the suggested MVTec dataset and saved it as a folder named mvtec in the data folder. After that, I went to Google Drive, downloaded the foreground file, and pasted it into the mvtec folder. When I then specify the DTD path and run python train.py, the following error message occurs. (This does not happen with the VisA dataset.)

Since my skills are limited, I would really appreciate it if you could explain.

image

Below is the path and file status of the dataset I set.
image

Checkpoints for VisA dataset

Hi! First of all, thank you very much for sharing this fascinating work.

I was wondering whether you were planning to release the checkpoints of the trained models for (at least) the VisA dataset.
I would like to re-evaluate this work with another protocol to insert it into a benchmark, so to be as fair as possible with your results I would prefer to have the original weights.

Thank you again!

a question about syntheticanomalies

In the paper, you say that a pre-segmentation result F is obtained first, but I can't find it in your code. Looking forward to your reply!
image

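Training and inference time

Hello, could you share your hardware configuration and the corresponding training time and inference speed? Thank you!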

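Training on a chemical fiber dataset

Hello, I would like to ask: if I want to test on the chemical fiber anomaly dataset shown below, how should I adjust the parameters or the model to improve the results? Is the time step setting of 1000 the default because it gives the best results?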
2023-03-25_15-34-15-1373_camera2_1_0
d_0000250_2023-07-03-14-35-27_camera00_1_0
d_0012000_2023-04-25-18-41-21_camera01_3_0
d_0000500_2023-07-26-17-06-04_camera01_0_0

nan loss (noise loss)

Hi @HuiZhang0812 ,

I have noticed that when all samples in a batch are anomalous (anomaly_label == 1), the loss computed in the function norm_guided_one_step_denoising of the DDPM.py file becomes nan:

line 359 -> loss = (normal_loss["loss"]+noisier_loss["loss"])[anomaly_label==0].mean() (because .mean() is computed on an empty tensor [])

And according to the equation (9) of your original paper

image

The term (1 - y) seems intended to make Lnoise equal to 0 when the sample is anomalous (y = 1). To make the code reflect this, one possible solution is the following:

after line 359 (pasted above) we can add

if torch.isnan(loss):
    loss.fill_(0.0)
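An alternative to patching the nan afterwards is to apply the (1 - y) weighting explicitly, so that an all-anomalous batch yields 0 by construction. The sketch below uses plain Python lists so that it runs without PyTorch. `masked_noise_loss` and its arguments are hypothetical names, not code from DDPM.py.

```python
def masked_noise_loss(per_sample_loss, anomaly_label):
    """Average the loss over normal samples (label 0); return 0.0 if there are none.

    Mirrors the (1 - y) weighting described above: anomalous samples (y = 1)
    contribute nothing to the noise loss.
    """
    weights = [1 - y for y in anomaly_label]
    normal_count = sum(weights)
    if normal_count == 0:
        # All-anomalous batch: avoid the 0/0 that shows up as nan in tensor code.
        return 0.0
    weighted = sum(w * loss for w, loss in zip(weights, per_sample_loss))
    return weighted / normal_count


if __name__ == "__main__":
    print(masked_noise_loss([1.0, 3.0], [0, 0]))  # both samples normal
    print(masked_noise_loss([1.0, 3.0], [0, 1]))  # only the first sample counts
    print(masked_noise_loss([1.0, 3.0], [1, 1]))  # no normal samples
```

In tensor code the same idea is commonly written as a weighted sum divided by a count clamped to at least 1, which keeps the result a tensor rather than a Python float.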

Thank you very much for your work.

If there are any additional comments or adjustments needed, I would be happy to collaborate to make sure this issue is fully resolved. Also, if there are other contributors who would like to review the solution, your comments are welcome!

I appreciate your help in this matter and look forward to hearing your comments.

Best regards!
N

Training DAGM Dataset Error

class Class2
args1.json defaultdict(<class 'str'>, {'img_size': [256, 256], 'Batch_Size': 6, 'EPOCHS': 20, 'T': 1000, 'base_channels': 128, 'beta_schedule': 'linear', 'loss_type': 'l2', 'diffusion_lr': 0.0001, 'seg_lr': 1e-05, 'random_slice': True, 'weight_decay': 0.0, 'save_imgs': True, 'save_vids': False, 'dropout': 0, 'attention_resolutions': '32,16,8', 'num_heads': 4, 'num_head_channels': -1, 'noise_fn': 'gauss', 'channels': 3, 'mvtec_root_path': '/content/drive/MyDrive/DiffusionDA/DiffusionAD-main/datasets/mvtec', 'visa_root_path': '/content/drive/MyDrive/DiffusionDA/DiffusionAD-main/datasets/VisA_1class/1cls', 'dagm_root_path': '/content/drive/MyDrive/DiffusionDA/DiffusionAD-main/datasets/dagm', 'mpdd_root_path': '/content/drive/MyDrive/DiffusionDA/DiffusionAD-main/datasets/mpdd', 'anomaly_source_path': '/content/drive/MyDrive/DiffusionDA/DiffusionAD-main/datasets/DTD', 'noisier_t_range': 600, 'less_t_range': 300, 'condition_w': 1, 'eval_normal_t': 200, 'eval_noisier_t': 400, 'output_path': 'outputs', 'arg_num': '1'})
Traceback (most recent call last):
  File "/content/drive/MyDrive/DiffusionDA/DiffusionAD-main/train.py", line 333, in <module>
    main()
  File "/content/drive/MyDrive/DiffusionDA/DiffusionAD-main/train.py", line 315, in main
    training_dataset_loader = DataLoader(training_dataset, batch_size=args['Batch_Size'],shuffle=True,num_workers=8,pin_memory=True,drop_last=True)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 344, in __init__
    sampler = RandomSampler(dataset, generator=generator) # type: ignore[arg-type]
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/sampler.py", line 107, in __init__
    raise ValueError("num_samples should be a positive integer "
ValueError: num_samples should be a positive integer value, but got num_samples=0
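The final ValueError says the sampler received a dataset of length 0, which usually means no files were found under the configured path. A minimal sketch of checking this before the DataLoader is built follows. `count_images` and `require_images` are hypothetical helpers, and the folder to pass in depends on how the dataset class lays out `dagm_root_path`, which is not shown on this page.

```python
import os
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def count_images(folder: str) -> int:
    """Count image files directly under `folder` (0 if it does not exist)."""
    root = Path(folder)
    if not root.is_dir():
        return 0
    return sum(1 for p in root.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)


def require_images(folder: str) -> int:
    """Fail early with a descriptive error instead of an empty-sampler error."""
    n = count_images(folder)
    if n == 0:
        raise FileNotFoundError(
            f"no images found in {folder!r}; check the dataset root and class name"
        )
    return n


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        open(os.path.join(tmp, "0001.png"), "w").close()
        print(require_images(tmp))
```

Running such a check on the training folder for Class2 would show whether the path or the expected sub-folder names are the problem.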

problems with the loss updating

Hi @HuiZhang0812 ,

First of all, thanks a lot for your work and for sharing it here!

I have noticed something that I think could be a coding error. In train.py, lines 136-141, the train losses are added to the epoch total loss:

train_loss += loss.item()
tbar.set_description('Epoch:%d, Train loss: %.3f' % (epoch, train_loss))

train_smL1_loss += smL1_loss.item()
train_focal_loss+=5*focal_loss.item()
train_noise_loss+=noise_loss.item()

I suppose the intention was to compute, in lines 144-149, the mean of those epoch losses by dividing them by the number of steps in the epoch (i). But this step seems to have been forgotten, and the summed epoch loss is appended instead.

I think the following:

       if epoch % 10 ==0  and epoch > 0:
           train_loss_list.append(round(train_loss,3))
           train_smL1_loss_list.append(round(train_smL1_loss,3))
           train_focal_loss_list.append(round(train_focal_loss,3))
           train_noise_loss_list.append(round(train_noise_loss,3))
           loss_x_list.append(int(epoch))

Has to be replaced with:

       if epoch % 10 ==0  and epoch > 0:
           train_loss_list.append(round(train_loss / (i+1),3))
           train_smL1_loss_list.append(round(train_smL1_loss / (i+1),3))
           train_focal_loss_list.append(round(train_focal_loss / (i+1),3))
           train_noise_loss_list.append(round(train_noise_loss / (i+1),3))
           loss_x_list.append(int(epoch))

(to take advantage of the i variable, obviously there are alternative ways to achieve it).
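One of the alternative ways mentioned above is a small accumulator that tracks the step count itself, so the average does not depend on the loop variable. This is a sketch with a hypothetical class name, not code from train.py.

```python
class EpochLossMeter:
    """Accumulate per-step losses and report their mean for the epoch."""

    def __init__(self) -> None:
        self.total = 0.0
        self.steps = 0

    def update(self, value: float) -> None:
        self.total += value
        self.steps += 1

    @property
    def mean(self) -> float:
        # Guard against an epoch with no steps.
        return self.total / self.steps if self.steps else 0.0


if __name__ == "__main__":
    meter = EpochLossMeter()
    for step_loss in (2.0, 4.0, 6.0):
        meter.update(step_loss)
    print(round(meter.total, 3))  # summed value: 12.0
    print(round(meter.mean, 3))   # per-step average: 4.0
```

The summed value grows with the number of steps per epoch, so it is not comparable across batch sizes or datasets, while the mean is.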

In addition, these loss lists are not saved or logged anywhere afterwards (I guess this is a decision you are aware of).

Again, thank you very much for your great work in developing this SOTA algorithm and even more for sharing it with the community.

Does the model have to be trained on all classes in the entire dataset?

Because training on the entire MVTec dataset is too time-consuming, I only trained the 'carpet' category.
This is my args.json, with batch size = 4 and epochs = 1500, which differs from the author's batch size = 16 and epochs = 3000.

{
  "img_size": [256,256],
  "Batch_Size": 4,
  "EPOCHS": 1500,
  "T": 1000,
  "base_channels": 128,
  "beta_schedule": "linear",
  "loss_type": "l2",
  "diffusion_lr": 1e-4,
  "seg_lr": 1e-5,
  "random_slice": true,
  "weight_decay": 0.0,
  "save_imgs":true,
  "save_vids":false, 
  "dropout":0,
  "attention_resolutions":"32,16,8",
  "num_heads":4,
  "num_head_channels":-1,
  "noise_fn":"gauss",
  "channels":3,
  "mvtec_root_path":"/workspace/DiffusionAD/MVTec-AD",
  "visa_root_path":"/workspace/DiffusionAD/VisA",
  "dagm_root_path":"/workspace/DiffusionAD/dagm",
  "mpdd_root_path":"/workspace/DiffusionAD/mpdd",
  "anomaly_source_path":"/workspace/DiffusionAD/dtd",
  "noisier_t_range":600,
  "less_t_range":300,
  "condition_w":1,
  "eval_normal_t":200,
  "eval_noisier_t":400,
  "output_path":"/workspace/DiffusionAD/outputs"

}

After 1500 epochs, the train loss is 3.232.
trans_loss

When I run eval.py, it reports missing keys:
image
So I changed eval.py (line 381):

        unet_model.load_state_dict(output["unet_model_state_dict"],strict=False)
        unet_model.to(device)

After this change:
image
The out_mask results are better, but recon_con is still a Gaussian noise image.

Isn't this training one model per category? Why does the Segmentation Sub-network train successfully while Norm-guided One-step Denoising does not give good results? Is it because I didn't train for enough epochs? Or must all categories be trained to avoid the missing-keys problem?

Looking forward to your reply
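With `strict=False`, any parameter whose key is missing from the checkpoint keeps its initial value, which could be consistent with a reconstruction that still looks like noise. Before relaxing strictness it can help to list exactly which keys disagree. The sketch below compares two collections of key names. The key names and the "module." prefix case are hypothetical illustrations, not taken from this repository's checkpoints.

```python
def compare_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, sorted for stable output."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)     # model expects, checkpoint lacks
    unexpected = sorted(checkpoint_keys - model_keys)  # checkpoint has, model ignores
    return missing, unexpected


def strip_prefix(keys, prefix="module."):
    """Remove a leading prefix such as the one nn.DataParallel adds to keys."""
    return [k[len(prefix):] if k.startswith(prefix) else k for k in keys]


if __name__ == "__main__":
    # Hypothetical key names, for illustration only.
    model = ["down.0.weight", "down.0.bias", "out.weight"]
    ckpt = ["module.down.0.weight", "module.down.0.bias", "module.out.weight"]
    print(compare_state_dict_keys(model, ckpt))
    print(compare_state_dict_keys(model, strip_prefix(ckpt)))
```

In PyTorch the two inputs would be `unet_model.state_dict().keys()` and the keys of `output["unet_model_state_dict"]`. The object returned by `load_state_dict` also exposes `missing_keys` and `unexpected_keys`, which gives the same information after a non-strict load.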

Foreground images

Hello, I would like to train on my own dataset. How can I obtain the foreground images?

Error

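I modified mvtec_root_path, anomaly_source_path, and output_path as required, but training still fails with an error. Also, in the data structure you provided, the thresh folder for the carpet category of the MVTec-AD dataset cannot be found in the dataset.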

Experiment on MVTec LOCO

Dear @HuiZhang0812 , your work is amazing.
I would like to try DiffusionAD on the MVTec LOCO dataset,
but I do not know whether your method works on it.
Has your team ever tried this dataset?
Do you have any suggestions if I want to try?

Thanks for your reply ^-^

Excessive CUDA Memory Usage in UNet Code of Recon Module During Forward Function Process

https://github.com/HuiZhang0812/DiffusionAD/blob/80853e65cbe92677839cd093596d287e31f5e723/models/Recon_subnetwork.py#L398C13-L398C38

Dear @HuiZhang0812

I am encountering a significant issue with the UNet code within the Recon Module, specifically during the process of the forward function. The problem arises when updating the hidden embeddings, leading to an excessive consumption of CUDA memory.

While validating the code with a batch size of 16, I noticed that the memory in the VRAM accumulates continuously in this particular section. This issue results in the exhaustion of all 48GB of VRAM on my RTX A6000 GPU before a single model run is completed, leading to a CUDA Out Of Memory (OOM) error.

I would greatly appreciate your assistance in investigating and addressing this matter. Your prompt attention to this issue would be highly valued.

Thank you for your time and support.

Best regards,
Woojun Lee

Hello, what can be input to self.textural_foreground_path

Hello, I read the code and I want to train only on the MVTec dataset, but I don't know what should go in the path "foreground_path +'/thresh'". Is it the ground-truth images?

"self.textural_foreground_path = sorted(glob.glob(foreground_path +"/thresh/*.png"))"

Thank you!

Hello, a question about image score

Hello, thank you for your open-source code. I trained on the MVTec AD dataset and get good performance on the anomalous images. But when I test normal images, the image score is low while some regions of the heatmap are high. What is the relationship between the image score and the heatmap? How can I reduce the high heatmap values on normal images?

DiffusionAD experiment log_001

DAGM dataset

Excuse me, is there a suitable DAGM dataset and a corresponding data tree for DiffusionAD?

Questions about the paper

Hello, do you reconstruct with a one-step approach? The one-step approximation I obtained was poor.

Should we have a thresh directory in datasets?

Hello,
I saw this line of code in dataset_beta_thresh.py: 'self.textural_foreground_path = sorted(glob.glob(foreground_path +"/thread/*.png"))'. Since textural_foreground_path is always empty (length 0), I doubt whether 'thread' is right. I don't have such a directory; shouldn't it be 'thresh' instead?
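`glob.glob` returns an empty list when the directory in the pattern does not exist, so a wrong folder name fails silently. A minimal sketch of probing both folder names mentioned in this issue follows. `find_foreground_masks` is a hypothetical helper, and which name the repository actually intends is not confirmed here.

```python
import glob
import os
import tempfile


def find_foreground_masks(foreground_path: str, candidates=("thresh", "thread")):
    """Return (subfolder, sorted PNG paths) for the first candidate containing PNGs."""
    for name in candidates:
        paths = sorted(glob.glob(os.path.join(foreground_path, name, "*.png")))
        if paths:
            return name, paths
    raise FileNotFoundError(
        f"no PNG files under {foreground_path!r} in any of {list(candidates)}"
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, "thresh"))
        open(os.path.join(tmp, "thresh", "000.png"), "w").close()
        name, paths = find_foreground_masks(tmp)
        print(name, len(paths))
```

Raising an error when nothing matches makes the mismatch visible at start-up instead of leaving an empty list to cause problems later.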

I want to use the VisA dataset. Besides changing the paths, is there anything else I need to change?

{
"img_size": [256,256],
"Batch_Size": 16,
"EPOCHS": 3000,
"T": 1000,
"base_channels": 128,
"beta_schedule": "linear",
"loss_type": "l2",
"diffusion_lr": 1e-4,
"seg_lr": 1e-5,
"random_slice": true,
"weight_decay": 0.0,
"save_imgs":true,
"save_vids":false,
"dropout":0,
"attention_resolutions":"32,16,8",
"num_heads":4,
"num_head_channels":-1,
"noise_fn":"gauss",
"channels":3,
"mvtec_root_path":"/home/anywhere3090l/Desktop/henry/jjmvtec/mvtec",
"visa_root_path":"/home/anywhere3090l/Desktop/henry/jjmvtec/VisA_dataset",
"dagm_root_path":"datasets/dagm",
"mpdd_root_path":"datasets/mpdd",
"anomaly_source_path":"datasets/DTD",
"noisier_t_range":600,
"less_t_range":300,
"condition_w":1,
"eval_normal_t":200,
"eval_noisier_t":400,
"output_path":"outputs"

}

Pre-trained model

Can you share the pre-trained models that produced the results in the paper? Thank you.

torch.cuda.OutOfMemoryError:

Hello!

I have run your train.py code in Google Colab, but I keep getting the error below. I am already using an A100 GPU in Colab Pro. Do you have any idea how to solve this?

class carpet
args1.json defaultdict(<class 'str'>, {'img_size': [256, 256], 'Batch_Size': 8, 'EPOCHS': 3000, 'T': 1000, 'base_channels': 128, 'beta_schedule': 'linear', 'loss_type': 'l2', 'diffusion_lr': 0.0001, 'seg_lr': 1e-05, 'random_slice': True, 'weight_decay': 0.0, 'save_imgs': True, 'save_vids': False, 'dropout': 0, 'attention_resolutions': '32,16,8', 'num_heads': 4, 'num_head_channels': -1, 'noise_fn': 'gauss', 'channels': 3, 'mvtec_root_path': 'datasets/mvtec', 'visa_root_path': 'datasets/VisA_1class/1cls', 'dagm_root_path': 'datasets/dagm', 'mpdd_root_path': 'datasets/mpdd', 'anomaly_source_path': 'datasets/DTD', 'noisier_t_range': 600, 'less_t_range': 300, 'condition_w': 1, 'eval_normal_t': 200, 'eval_noisier_t': 400, 'output_path': 'outputs', 'arg_num': '1'})
0% 0/35 [00:16<?, ?it/s]
Traceback (most recent call last):
File "/content/drive/MyDrive/DiffusionDA/DiffusionAD-main/train.py", line 337, in <module>
main()
File "/content/drive/MyDrive/DiffusionDA/DiffusionAD-main/train.py", line 332, in main
train(training_dataset_loader, test_loader, args, data_len,sub_class,class_type,device )
File "/content/drive/MyDrive/DiffusionDA/DiffusionAD-main/train.py", line 117, in train
noise_loss, pred_x0,normal_t,x_normal_t,x_noiser_t = ddpm_sample.norm_guided_one_step_denoising(unet_model, aug_image, anomaly_label,args)
File "/content/drive/MyDrive/DiffusionDA/DiffusionAD-main/models/DDPM.py", line 354, in norm_guided_one_step_denoising
noisier_loss, x_noiser_t, estimate_noise_noisier = self.calc_loss(model, x_0, noisier_t)
File "/content/drive/MyDrive/DiffusionDA/DiffusionAD-main/models/DDPM.py", line 333, in calc_loss
estimate_noise = model(x_t, t)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/content/drive/MyDrive/DiffusionDA/DiffusionAD-main/models/Recon_subnetwork.py", line 403, in forward
h = module(h, time_embed)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/content/drive/MyDrive/DiffusionDA/DiffusionAD-main/models/Recon_subnetwork.py", line 32, in forward
x = layer(x, emb)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/content/drive/MyDrive/DiffusionDA/DiffusionAD-main/models/Recon_subnetwork.py", line 217, in forward
return self.skip_connection(x) + h
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 39.56 GiB total capacity; 38.73 GiB already allocated; 8.81 MiB free; 38.99 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
