jolibrain / joligen


200.0 200.0 29.0 13.54 MB

Generative AI Image Toolset with GANs and Diffusion for Real-World Applications

Home Page: https://www.joligen.com

License: Other

Python 97.82% Shell 1.14% C++ 0.10% Cuda 0.94%

augmented-reality deep-learning diffusion-models gan generative-model image-generation image-to-image pytorch

joligen's Issues

A question about training on weather conditions

Hello,

I ran a training from clear to snowy on BDD100K. After epoch 15, the loss is almost stable. Is this normal?

scripts to run cyclegan_sty2_model

Hi, thanks for sharing this package. Do you have a script to run the cyclegan_sty2_model? It would be nice to have a script similar to the semantic one, in which many detailed parameters are given. Thanks.

Example of a json file for training

Hi,

The documentation mentions that we can use the option --config_json with the configuration for training. Could you please share an example of the json file?

Thanks!
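A minimal sketch of what such a file could look like, written here as a short Python script that produces it. The nested key names are copied from the configuration quoted in the "'float' object has no attribute 'clone'" issue further down this page; the values, the experiment name and the output filename are placeholders, and the full set of supported keys should be checked against the joliGEN documentation.

```python
import json

# Illustrative subset of a training config. Key names come from the config
# quoted in another issue on this page; values are placeholders.
config = {
    "D": {"netDs": ["projected_d", "basic"], "norm": "instance"},
    "G": {"netG": "mobile_resnet_attn", "ngf": 64, "norm": "instance"},
    "data": {
        "crop_size": 256,
        "load_size": 256,
        "dataset_mode": "unaligned",
        "direction": "AtoB",
    },
    "train": {"batch_size": 4, "G_lr": 0.0002, "D_lr": 0.0001, "n_epochs": 200},
    "model_type": "cut",
    "name": "my_experiment",  # hypothetical experiment name
}


def write_config(path, cfg):
    """Serialize the config dict to a JSON file and return the text written."""
    text = json.dumps(cfg, indent=4, sort_keys=True)
    with open(path, "w", encoding="utf-8") as f:
        f.write(text + "\n")
    return text


if __name__ == "__main__":
    write_config("my_config.json", config)  # hypothetical filename
```

The resulting file would then be passed with --config_json, alongside --dataroot, --checkpoints_dir and --name, as in the training commands quoted in other issues on this page.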

Output image size when applying a model (inference)

Hi,

Which option shall I use in order to maintain the output image size identical to the input image size when applying a model?

Thank you.

inference unetref generator

Trying to run with a unetref generator checkpoint trained with config

python3 scripts/gen_single_image_diffusion.py \
     --model-in-file latest_net_G_A.pth \
     --img-in viton_bbox_ref/testA/imgs/00006_00.jpg \
     --mask-in viton_bbox_ref/testA/ref/00006_00.jpg \
     --dir-out checkpoints/viton_bbox_ref/inference_output \
     --img-width 128 \
     --img-height 128

Getting the following error:

  warnings.warn(
Dual U-Net: number of ref blocks:  15
sampling loop time step:   0%|                                                                                                                                                                               | 0/1000 [00:00<?, ?it/s]
  0%|                                                                                                                                                                                                           | 0/1 [00:03<?, ?it/s]
Traceback (most recent call last):
  File "/joliGEN/scripts/gen_single_image_diffusion.py", line 808, in <module>
    frame, lmodel, lopt = generate(**vars(args))
                          ^^^^^^^^^^^^^^^^^^^^^^
  File "/joliGEN/scripts/gen_single_image_diffusion.py", line 563, in generate
    out_tensor, visu = model.restoration(
                       ^^^^^^^^^^^^^^^^^^
  File "/joliGEN/models/modules/diffusion_generator.py", line 95, in restoration
    return self.restoration_ddpm(
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/joliGEN/models/modules/diffusion_generator.py", line 149, in restoration_ddpm
    y_t = self.p_sample(
          ^^^^^^^^^^^^^^
  File "/joliGEN/models/modules/diffusion_generator.py", line 253, in p_sample
    model_mean, model_log_variance = self.p_mean_variance(
                                     ^^^^^^^^^^^^^^^^^^^^^
  File "/joliGEN/models/modules/diffusion_generator.py", line 219, in p_mean_variance
    noise=self.denoise_fn(
          ^^^^^^^^^^^^^^^^
  File "/joliGEN/venv_joli/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/joliGEN/models/modules/palette_denoise_fn.py", line 109, in forward
    out = self.model(input, embedding, ref)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/joliGEN/venv_joli/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/joliGEN/models/modules/unet_generator_attn/unet_generator_attn.py", line 1605, in forward
    h, hs, emb, h_ref, hs_ref = self.compute_feats(
                                ^^^^^^^^^^^^^^^^^^^
  File "/joliGEN/models/modules/unet_generator_attn/unet_generator_attn.py", line 1595, in compute_feats
    h, _ = module(h, emb, qkv_ref=qkv_list.pop(0))
                                  ^^^^^^^^
UnboundLocalError: cannot access local variable 'qkv_list' where it is not associated with a value
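For readers unfamiliar with this error class, the sketch below reproduces the general Python pattern behind an UnboundLocalError: a local name that is assigned on only one branch and then used unconditionally. It is not the joliGEN code; the function names and the `ref` argument are hypothetical, and whether a missing reference input is what triggers the error above is an assumption, not something the traceback confirms.

```python
def pop_first_unsafe(ref=None):
    # qkv_list is only assigned when a reference is given...
    if ref is not None:
        qkv_list = [f"qkv({ref})"]
    # ...so this line raises UnboundLocalError when ref is None.
    return qkv_list.pop(0)


def pop_first_safe(ref=None):
    # Bind the name on every path, then fail with an explicit message.
    qkv_list = [f"qkv({ref})"] if ref is not None else []
    if not qkv_list:
        raise ValueError("a reference input is required by this generator")
    return qkv_list.pop(0)
```

The second variant turns the confusing UnboundLocalError into an error that names the missing input.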

Warning on torch transforms with semantics

Setting up a new session...
create web directory /data1/beniz/models/gan_checkpoints/face_masks_cut_3/web...
/usr/local/lib/python3.6/dist-packages/torchvision/transforms/functional.py:365: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.
  "Argument interpolation should be of type InterpolationMode instead of int. "
/usr/local/lib/python3.6/dist-packages/torchvision/transforms/functional.py:365: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.
  "Argument interpolation should be of type InterpolationMode instead of int. "
/usr/local/lib/python3.6/dist-packages/torchvision/transforms/functional.py:365: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.
  "Argument interpolation should be of type InterpolationMode instead of int. "
/usr/local/lib/python3.6/dist-packages/torchvision/transforms/functional.py:365: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.
  "Argument interpolation should be of type InterpolationMode instead of int. "

I believe we should fix this so that we are certain the interpolation method is the right one, especially when resizing masks with nearest neighbors.
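To illustrate why the interpolation method matters for masks, here is a small pure-Python sketch (not joliGEN or torchvision code) that resizes a one-dimensional row of class ids with nearest-neighbour and with linear interpolation. The helper names are hypothetical. In torchvision itself, the warning asks for the `InterpolationMode` enum, e.g. `InterpolationMode.NEAREST`, instead of an integer.

```python
def resize_nearest(labels, new_len):
    """Resize a 1-D row of class ids by picking the nearest source pixel."""
    old_len = len(labels)
    return [labels[min(old_len - 1, int(i * old_len / new_len))] for i in range(new_len)]


def resize_linear(labels, new_len):
    """Resize by linear interpolation; fine for images, wrong for class ids."""
    old_len = len(labels)
    out = []
    for i in range(new_len):
        # Map the output index to a fractional source position.
        pos = i * (old_len - 1) / (new_len - 1) if new_len > 1 else 0.0
        lo = int(pos)
        hi = min(lo + 1, old_len - 1)
        frac = pos - lo
        out.append(labels[lo] * (1 - frac) + labels[hi] * frac)
    return out


mask_row = [0, 0, 7, 7]  # two classes: background (0) and class 7
nearest = resize_nearest(mask_row, 8)
linear = resize_linear(mask_row, 8)
```

Nearest-neighbour only ever outputs ids that exist in the input, whereas linear interpolation produces in-between values at the class boundary that correspond to no real class.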

Preserving the semantics during translation

Hi,

I wonder what translation model should be used to maximize the semantic correctness during the translation. Any particular architecture, or discriminator that should be used?

Thanks!

Using TPUs

Hello,

If we plan to use TPUs instead of GPUs, is it possible with the current config or shall we use a different configuration?

Thanks

Import images with paths.txt file in unaligned mode

Hello,

I am running a training in unaligned mode. As I understand it, the dataset directory should be structured like this:

domainA2domainB/
    trainA/
        image_a1.jpg
        image_a2.jpg
        ...
    trainB/
        image_b1.jpg
        image_b2.jpg
        ...

I would be interested in importing images using paths.txt files as follows:

domainA2domainB/
    trainA/
        paths.txt
    trainB/
        paths.txt

This does not work at the moment, and it would be useful given my infrastructure. Could this feature be added without too much effort?
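The requested behaviour can be sketched as a small loader that prefers paths.txt when it exists and falls back to scanning the directory otherwise. This is a sketch under assumptions, not joliGEN's loader: the function name is hypothetical, and the paths.txt format (first whitespace-separated field of each line is an image path relative to the domain directory) is assumed.

```python
from pathlib import Path

IMG_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_domain_images(domain_dir):
    """Return image paths for one domain directory (e.g. trainA).

    If a paths.txt file is present, the first whitespace-separated field of
    each non-empty line is treated as an image path relative to domain_dir
    (an assumed format). Otherwise the directory is scanned for image files.
    """
    domain_dir = Path(domain_dir)
    paths_file = domain_dir / "paths.txt"
    if paths_file.is_file():
        images = []
        for line in paths_file.read_text(encoding="utf-8").splitlines():
            fields = line.split()
            if fields:
                images.append(domain_dir / fields[0])
        return images
    return sorted(
        p for p in domain_dir.iterdir() if p.suffix.lower() in IMG_EXTENSIONS
    )
```

With this shape, both directory layouts shown above are handled by the same call, one per domain.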

Resuming a CUT model fails

I get this error:

initialize network with normal
initialize network with normal
model [CUTSemanticMaskModel] was created
loading the model from /data1/beniz/models/gan_checkpoints/face_masks_cut_2/latest_net_G.pth
loading the model from /data1/beniz/models/gan_checkpoints/face_masks_cut_2/latest_net_F.pth
Traceback (most recent call last):
  File "train.py", line 35, in <module>
    model.setup(opt)               # regular setup: load and print networks; create schedulers
  File "/home/beniz/projects/deepdetect/dev/joliGAN/models/base_model.py", line 137, in setup
    self.load_networks(load_suffix)
  File "/home/beniz/projects/deepdetect/dev/joliGAN/models/base_model.py", line 254, in load_networks
    self.__patch_instance_norm_state_dict(state_dict, net, key.split('.'))
  File "/home/beniz/projects/deepdetect/dev/joliGAN/models/base_model.py", line 230, in __patch_instance_norm_state_dict
    self.__patch_instance_norm_state_dict(state_dict, getattr(module, key), keys, i + 1)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 948, in __getattr__
    type(self).__name__, name))
AttributeError: 'PatchSampleF' object has no attribute 'mlp_0'

Probably some of the network parameters are not saved.

BUG: AttributeError: 'float' object has no attribute 'clone'

On the current master I am getting the following error consistently:

Traceback (most recent call last):
  File "/workspace/joliGEN/train.py", line 445, in <module>
    launch_training(opt)
  File "/workspace/joliGEN/train.py", line 419, in launch_training
    mp.spawn(
  File "/usr/local/lib/python3.10/dist-packages/torch/multiprocessing/spawn.py", line 246, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method="spawn")
  File "/usr/local/lib/python3.10/dist-packages/torch/multiprocessing/spawn.py", line 202, in start_processes
    while not context.join():
  File "/usr/local/lib/python3.10/dist-packages/torch/multiprocessing/spawn.py", line 163, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/torch/multiprocessing/spawn.py", line 74, in wrap
    fn(i, *args)
  File "/workspace/joliGEN/train.py", line 197, in train_gpu
    model.optimize_parameters() # calculate loss functions, get gradients, update network weights
  File "/workspace/joliGEN/models/base_model.py", line 1138, in optimize_parameters
    self.compute_step(group.optimizer, loss_names)
  File "/workspace/joliGEN/models/base_model.py", line 1029, in compute_step
    getattr(self, "loss" + loss_name).clone()
AttributeError: 'float' object has no attribute 'clone'

root@3c5c0934a4de:/workspace/joliGEN#

AttributeError: 'float' object has no attribute 'clone'

Here is my config:
{
"D": {
"dropout": false,
"n_layers": 3,
"ndf": 64,
"netDs": [
"projected_d",
"basic"
],
"norm": "instance",
"proj_interp": 1024,
"proj_network_type": "efficientnet"
},
"G": {
"attn_nb_mask_attn": 10,
"attn_nb_mask_input": 1,
"dropout": false,
"nblocks": 9,
"netG": "mobile_resnet_attn",
"ngf": 64,
"norm": "instance",
"padding_type": "reflect"
},
"alg": {
"gan": {
"lambda": 1.0
},
"cut": {
"HDCE_gamma": 1.0,
"HDCE_gamma_min": 1.0,
"MSE_idt": false,
"flip_equivariance": false,
"lambda_MSE_idt": 1.0,
"lambda_NCE": 1.0,
"lambda_SRC": 0.0,
"nce_T": 0.07,
"nce_idt": true,
"nce_includes_all_negatives_from_minibatch": false,
"nce_layers": "0,4,8,12,16",
"nce_loss": "monce",
"netF": "mlp_sample",
"netF_dropout": false,
"netF_nc": 256,
"netF_norm": "instance",
"num_patches": 256
}
},
"data": {
"crop_size": 256,
"dataset_mode": "unaligned",
"direction": "AtoB",
"load_size": 256,
"max_dataset_size": 1000000000,
"num_threads": 4,
"preprocess": "resize_and_crop"
},
"output": {
"display": {
"freq": 400,
"id": 1,
"ncols": 0,
"type": [
"visdom"
],
"visdom_port": 8097,
"visdom_server": "http://localhost",
"winsize": 256
},
"no_html": false,
"print_freq": 100,
"update_html_freq": 1000,
"verbose": false
},
"model": {
"init_gain": 0.02,
"init_type": "normal",
"input_nc": 3,
"multimodal": false,
"output_nc": 3
},
"train": {
"D_lr": 0.0001,
"G_ema": false,
"G_ema_beta": 0.999,
"G_lr": 0.0002,
"batch_size": 4,
"beta1": 0.9,
"beta2": 0.999,
"continue": false,
"epoch": "latest",
"epoch_count": 1,
"export_jit": false,
"gan_mode": "lsgan",
"iter_size": 8,
"load_iter": 0,
"metrics_every": 1000,
"n_epochs": 200,
"n_epochs_decay": 100,
"nb_img_max_fid": 1000000000,
"optim": "adam",
"pool_size": 50,
"save_by_iter": false,
"save_epoch_freq": 1,
"save_latest_freq": 5000
},
"dataaug": {
"APA": false,
"APA_every": 4,
"APA_nimg": 50,
"APA_p": 0,
"APA_target": 0.6,
"D_diffusion": false,
"D_diffusion_every": 4,
"D_label_smooth": false,
"D_noise": 0.0,
"affine": 0.0,
"affine_scale_max": 1.2,
"affine_scale_min": 0.8,
"affine_shear": 45,
"affine_translate": 0.2,
"diff_aug_policy": "",
"diff_aug_proba": 0.5,
"imgaug": false,
"no_flip": false,
"no_rotate": true
},
"checkpoints_dir": "/root/joliGEN/checkpoints",
"dataroot": "/root/joliGEN/datasets/ecommerce",
"ddp_port": "12355",
"gpu_ids": "0",
"model_type": "cut",
"name": "ecommerce",
"phase": "train",
"test_batch_size": 1,
"warning_mode": false,
"with_amp": false,
"with_tf32": false,
"with_torch_compile": false
}

I have tested in multiple environments with both an RTX 3090 and an RTX 4090, and the error is consistent. If I roll back to commit 811ba3d, I can train just fine.
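The traceback shows `.clone()` being called on a loss attribute that holds a plain Python float. One way this can happen, assumed here and not confirmed by the source, is a loss that is initialised to a number and never replaced by a tensor before the step. The sketch below shows a defensive copy helper in pure Python; `FakeTensor` and `snapshot_loss` are hypothetical names standing in for a real tensor type and for whatever code reads the loss.

```python
class FakeTensor:
    """Stand-in for a tensor: anything exposing clone()."""

    def __init__(self, value):
        self.value = value

    def clone(self):
        return FakeTensor(self.value)


def snapshot_loss(loss):
    """Return a copy of a loss value that may be a tensor or a plain number."""
    # Tensor-like objects expose clone(); Python floats and ints are
    # immutable, so returning them unchanged is already a safe snapshot.
    clone = getattr(loss, "clone", None)
    return clone() if callable(clone) else loss
```

A guard like this only hides the symptom; if the float indicates that a loss was never computed, the underlying cause would still need to be found.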

Training with semantic segmentation error "NameError: name 'N_img_path' is not defined"

Hi, I ran into a problem while training with semantic segmentation.

After preparing the dataset for training with semantic segmentation, I got the following error: "NameError: name 'N_img_path' is not defined". I used the following command line:
python train.py --dataroot /home/shared/Synth2real_sem_seg --checkpoints_dir /home/shared/checkpoint_model --name Oktalse2bdd100k --config_json ./config_file_lib/synth2real_sem_seg.json --output_display_aim_server 127.0.0.1 --output_display_visdom_port 8501 --gpu_ids 0,1,2,3 --train_semantic_mask --data_relative_paths

My config file is close to the one in the "examples" directory: "example_gan_glasses2noglasses.json".

Do you know why this error occurred?

How to train diffusion model on paired i2i dataset like style transfer?

All the example configs are about the inpainting problem with the palette model.

Error when loading the dataset with Release 1.0.0

Hello,

I tried to train using Release 1.0.0, but I get this issue when loading the dataset, and then the training freezes.

The error: list index out of range domain B data loading for /home/........./snowy/imgs/0ce5101a-36562e6a.jpg

The same command works well with the code version from 1st August 2023.

Are there some options to change in order to use Release 1.0.0?

Thank you

What are the training times for the examples in JoliGEN training?

How much GPU time is needed to achieve the results shown in the examples?

Adding logo(s) in AI-generated shirt

I was able to generate a shirt using AI (virtual try-on). Now I want to add my own logo to that shirt. How can I do it? The logo must be accurate.

Unet_mha_ref_attn run training example

Hello! Thanks for your work. Could you please help me understand how to run the training of unet_mha_ref_attn from this PR on the VITON dataset?

The 'ref-in' parameter is indeed mandatory.

Hello,

I was following the guideline on how to use the DDPM model on the VITON-HD dataset.

When I reached the following command:

mkdir -p ~/inferences
cd ~/joliGEN/scripts
python3 gen_single_image_diffusion.py \
     --model-in-file ~/checkpoints/VITON-HD/latest_net_G_A.pth \
     --img-in ~/datasets/VITON-HD/testA/imgs/00006_00.jpg \
     --mask-in ~/datasets/VITON-HD/testA/mask/00006_00.png \
     --dir-out ~/inferences \
     --nb_samples 4 \
     --img-width 256 \
     --img-height 256

I got an exception saying that the ref variable is used before being defined. I checked the code, and the issue happens here (in the method called generate):

    if ref_in:
        ref = cv2.imread(ref_in)
        ref_orig = ref.copy()
        ref = cv2.cvtColor(ref, cv2.COLOR_BGR2RGB)

The issue is that ref_in is None, so ref never gets defined.

I checked main and found the following piece of code:

    options.parser.add_argument(
        "--ref-in", help="image used as reference", required=False
    )

So it looks like the parameter should not be required, but without passing --ref-in in the initial command, the script does not work properly.
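A minimal sketch of how an optional argument like this can be handled so that later code never sees an undefined name. It is not the joliGEN script: `build_parser` and `load_reference` are hypothetical, and `read_image` is a stand-in for an image reader such as `cv2.imread`.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("--img-in", required=True, help="input image")
    parser.add_argument("--ref-in", required=False, help="image used as reference")
    return parser


def load_reference(ref_in, read_image):
    """Return the loaded reference, or None when no reference was given."""
    # Bind ref on every path so later code can test `ref is None`
    # instead of hitting an unbound local variable.
    ref = None
    if ref_in:
        ref = read_image(ref_in)
    return ref
```

With `required=False` and no default, argparse sets `args.ref_in` to None when the flag is omitted, so downstream code has to either accept None or report that the chosen model needs a reference.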

Using `--sampling-steps` in `gen_single_image_diffusion.py` breaks the inference

The scheduler is missing from the model (or should be obtained from elsewhere):

python3 gen_single_image_diffusion.py --model-in-file /path/to/model/latest_net_G_A.pth --img-width 128 --img-height 128 --gpuid 3 --img-in /path/to/img.jpg --dir-out /data1/beniz/data/test_ddim/

returns

Traceback (most recent call last):
  File "gen_single_image_diffusion.py", line 794, in <module>
    generate(**vars(args))
  File "gen_single_image_diffusion.py", line 189, in generate
    model, opt = load_model(
  File "gen_single_image_diffusion.py", line 96, in load_model
    model.denoise_fn.beta_schedule["test"]["n_timestep"] = sampling_steps
  File "/home/beniz/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'PaletteDenoiseFn' object has no attribute 'beta_schedule'

Exception in dataloader.py

Hello, I am trying to train clear2snowy and I get the following exception:

-- Process 1 terminated with the following error:
Traceback (most recent call last):
File "/opt/conda/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
fn(i, *args)
File "train.py", line 125, in train_gpu

The training command is as follows:

python train.py \
     --dataroot datasets/clear2snowy/ \
     --checkpoints_dir checkpoints/clear2snowy \
     --name clear2snowy \
     --output_display_env clear2snowy \
     --output_display_freq 20 \
     --output_print_freq 20 \
     --train_G_lr 0.0002 \
     --train_D_lr 0.0001 \
     --data_crop_size 512 \
     --data_load_size 512 \
     --data_dataset_mode unaligned_labeled_mask_online \
     --model_type cut \
     --train_batch_size 2 \
     --train_iter_size 4 \
     --model_input_nc 3 \
     --model_output_nc 3 \
     --f_s_net segformer \
     --f_s_config_segformer models/configs/segformer/segformer_config_b0.py \
     --train_mask_f_s_B \
     --f_s_semantic_nclasses 11 \
     --G_config_segformer models/configs/segformer/segformer_config_b0.json \
     --data_online_creation_crop_size_A 512 \
     --data_online_creation_crop_delta_A 64 \
     --data_online_creation_mask_delta_A 64 \
     --data_online_creation_crop_size_B 512 \
     --data_online_creation_crop_delta_B 64 \
     --dataaug_D_noise 0.01 \
     --data_online_creation_mask_delta_B 64 \
     --alg_cut_nce_idt \
     --train_sem_use_label_B \
     --D_netDs projected_d basic vision_aided \
     --D_proj_interp 512 \
     --D_proj_network_type vitsmall \
     --train_G_ema \
     --G_padding_type reflect \
     --train_optim adam \
     --dataaug_no_rotate \
     --train_sem_idt \
     --model_multimodal \
     --train_mm_nz 16 \
     --G_netE resnet_512 \
     --f_s_class_weights 1 10 10 1 5 5 10 10 30 50 50 \
     --output_display_aim_server 127.0.0.1 \
     --output_display_visdom_port 8501 \
     --gpu_id 0,1 \
     --G_netG unet_256

problem training on master

After installing the latest version of the code, I tried to run a training with the following command:
"python train.py --dataroot /home/shared/seg_sem_2 --checkpoints_dir /home/shared/checkpoint_model --name oktal2bdd --config_json /home/ubuntu/seg_sem.json --output_display_aim_server 127.0.0.1 --output_display_visdom_port 8501 --gpu_ids 0,1,2,3 --train_semantic_mask --data_relative_paths"
I encountered this problem:

My data is structured as follows:
seg_sem_2/          # name of dataset
    trainA/         # Synthetic Oktalse
        bbox/       # boundingbox png format
        imgs/       # images png format
        paths.txt   # link imgs/bbox
    trainB/         # realistic BDD100K
        bbox/       # boundingbox png format
        imgs/       # images jpg format
        paths.txt
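A layout like the one described can be checked before launching a training with a few lines of Python. This is a hypothetical helper, not part of joliGEN; the expected entries (imgs, bbox, paths.txt under trainA and trainB) are taken from the listing in this issue.

```python
from pathlib import Path

EXPECTED = ("imgs", "bbox", "paths.txt")  # entries listed in the issue


def missing_entries(dataroot, domains=("trainA", "trainB")):
    """Return the expected entries that are absent under each domain folder."""
    dataroot = Path(dataroot)
    missing = []
    for domain in domains:
        for entry in EXPECTED:
            if not (dataroot / domain / entry).exists():
                missing.append(f"{domain}/{entry}")
    return missing
```

Calling `missing_entries("/home/shared/seg_sem_2")` would return an empty list when every expected folder and file is present.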

Diffusion Model

How can diffusion models be used for paired image-to-image translation, like zebra to horse, and compared with existing GAN models?

Problem using code to generate image from models

Hello,

I tried to generate images using the gen_single_image.py file in "/joliGAN/scripts/", but I got the following import error on the "networks" function from the "model" package.

So I checked where the "networks" function is called in the code, and it seems it should be referred to as gan_networks instead.
After replacing networks with gan_networks, I got another import error.

I checked in models and found that cut_semantic_mask_model is more likely called re_cut_semantic_mask_model.

My current goal is to take a model loaded from the joliGAN UI and use it on a sample of BDD100K images.
I would like to know whether I am using the appropriate repository.

Thanks for your attention.

Issues encountered while training a model on 4 GPUs

Hello,
I have a problem when training a model on 3 and 4 GPUs (it works when using 1 or 2 GPUs).
I am currently using a VM for the training, but it fails when I start a training with more than 3 GPUs. Did you also encounter this problem? Is there a way to fix it by changing a specific parameter in the training file?

PS: I am not using the latest version of the code.

Welcome update to OpenMMLab 2.0

I am Vansin, the technical operator of OpenMMLab. In September of last year, we announced the release of OpenMMLab 2.0 at the World Artificial Intelligence Conference in Shanghai. We invite you to upgrade your algorithm library to OpenMMLab 2.0 using MMEngine, which can be used for both research and commercial purposes. If you have any questions, please feel free to join us on the OpenMMLab Discord at https://discord.gg/amFNsyUBvm or add me on WeChat (van-sin) and I will invite you to the OpenMMLab WeChat group.

Here are the OpenMMLab 2.0 repos branches:

Repository          OpenMMLab 1.0 branch    OpenMMLab 2.0 branch
MMEngine            -                       0.x
MMCV                1.x                     2.x
MMDetection         0.x, 1.x, 2.x           3.x
MMAction2           0.x                     1.x
MMClassification    0.x                     1.x
MMSegmentation      0.x                     1.x
MMDetection3D       0.x                     1.x
MMEditing           0.x                     1.x
MMPose              0.x                     1.x
MMDeploy            0.x                     1.x
MMTracking          0.x                     1.x
MMOCR               0.x                     1.x
MMRazor             0.x                     1.x
MMSelfSup           0.x                     1.x
MMRotate            1.x                     1.x
MMYOLO              -                       0.x

Attention: please create a new virtual environment for OpenMMLab 2.0.

Problem running Docker

Hi,
I have some questions about how to build the Docker images.
As a first step, I tried to build both the "Dockerfile.build" and "Dockerfile.server" files. The "build" one builds correctly; however, when I try to run it, it exits immediately. Is that normal?
Moreover, I can't build Dockerfile.server because of credentials. I have the credentials, but I don't know where to put them, and if I try to connect using the URL "https://docker.joligan.com/v2/joligan_build/manifests/latest", I end up with:
"{"errors":[{"code":"MANIFEST_UNKNOWN","message":"manifest unknown","detail":{"Tag":"latest"}}]}"
Can you help me build and run those Dockerfiles correctly?

unet ref training preview and inference script

After 500 epochs, the training previews do not match the images generated with the inference script.

I tried fine-tuning previous checkpoints that were trained without the data_online_creation_load_size_A option:

python3 train.py \
     --dataroot /datasets/viton_ref/viton_bbox_ref \
     --checkpoints_dir /checkpoints \
     --name viton_bbox_ref \
     --config_json examples/example_ddpm_unetref_viton.json \
     --data_online_creation_load_size_A 768 1024 \
     --train_continue

Then I tried inference:

python3 scripts/gen_single_image_diffusion.py \
     --model-in-file /checkpoints/viton_bbox_ref/latest_net_G_A.pth \
     --img-in /datasets/viton_ref/viton_bbox_ref/trainA/imgs/00000_00.jpg \
     --bbox-in /datasets/viton_ref/viton_bbox_ref/trainA/bbox/00000_00.txt \
     --ref-in /datasets/viton_ref/viton_bbox_ref/trainA/ref/00000_00.jpg \
     --dir-out /checkpoints/viton_bbox_ref/inference_output \
     --img-width 128 \
     --img-height 128

I also tried inference with 96 128 (which did not improve the results) and with 512 512 (which takes approximately 6 hours).

Advice about training result

Hi, I started a training recently and got results that are not that great on synthetic-to-realistic image style transfer.
I use the master branch of the code and got these results on the loss function.
