
dalle2-laion's People

Contributors

cinerieus, harubaru, krish240574, nousr, veldrovive


dalle2-laion's Issues

How to finetune with the pretrained model?

Thanks for your amazing work!

I am trying to load the pretrained model and finetune it on my own data with the training code from dalle2-pytorch, but I ran into some problems:

  1. After the model loads, I don't run the training code but directly save the model weights, just to verify that the whole process works. But when I replace the original decoder model with my saved model for forward inference, the resulting image is random noise (see the roundtrip sketch below).
  2. As far as I know, dalle2-laion was trained with the dalle2-pytorch code, so why can't I reproduce this process? Is there a step I am missing?
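
In case it helps with item 1: a minimal roundtrip check (a sketch only; decoder here stands for whatever decoder module you loaded) can confirm whether the weights survive saving and reloading at all. If this passes but inference still produces noise, the problem is elsewhere (e.g. EMA vs. online weights, or a version mismatch), not the save itself.

import torch

# Sketch: save the loaded decoder, reload it, and verify every tensor
# survives the roundtrip unchanged. `decoder` is assumed to be the
# already-loaded decoder module.
sd_before = {k: v.clone() for k, v in decoder.state_dict().items()}
torch.save(decoder.state_dict(), "decoder_roundtrip.pth")
decoder.load_state_dict(torch.load("decoder_roundtrip.pth"))
for k, v in decoder.state_dict().items():
    assert torch.equal(v, sd_before[k]), f"weight mismatch in {k}"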

Notebook still using dalle2_pytorch==0.15

An update to the notebook would be nice, since the version it installs does not have inpainting code yet. The checkpoints aren't directly compatible, so I'm guessing this will take a while.

They're almost compatible. *.gamma values are changed to *.g, and the noise_scheduler variables are now children of the module noise_scheduler instead of at the top level of 'model'. There are other things that are preventing loading the older checkpoints, though, and this is a big reason the major version number increased. If you skip optimizer loading and some of the other things you may be able to update the inference model at least.

After updating everything I could find, I found that "noise_scheduler.p2_loss_weight" and "net.null_text_embed" are still missing. I'm going to play around with it for a bit and see if I can get it loading right; a rough sketch of the remapping is below.
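
For anyone attempting the same migration, here is a rough sketch of the key remapping described above. It is untested, the scheduler-variable prefixes are a guess, and decoder stands for an instance built from the new-version config.

import torch

ckpt = torch.load("old_decoder.pth", map_location="cpu")
state = ckpt.get("model", ckpt)

new_state = {}
for key, value in state.items():
    key = key.replace(".gamma", ".g")        # *.gamma was renamed to *.g
    if key.startswith(("betas", "alphas")):  # hypothetical scheduler prefixes
        key = "noise_scheduler." + key       # scheduler vars moved under a submodule
    new_state[key] = value

# strict=False tolerates parameters that only exist in the new version,
# such as noise_scheduler.p2_loss_weight and net.null_text_embed.
decoder.load_state_dict(new_state, strict=False)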

Error in colab

I ran the notebook, and in the "rerank" section I'm getting this:

---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

[<ipython-input-11-cf76c2be222f>](https://localhost:8080/#) in <module>()
     84     img.save(f"./reranked_output/example_{index}.png")
     85 
---> 86 rerank_test(5, 50)

1 frames

[<ipython-input-7-eb7fb2d7bc08>](https://localhost:8080/#) in format_image_grid(img_array)
    151   rows = max_decoder_index + 1
    152 
--> 153   w, h = example_image.size
    154   grid = Image.new('RGB', size=(cols*w, rows*h))
    155   grid_w, grid_h = grid.size

TypeError: cannot unpack non-iterable builtin_function_or_method object
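
Not a definitive diagnosis, but that message usually means example_image is a torch tensor rather than a PIL image: on a tensor, .size is a bound method (a builtin_function_or_method), not a tuple. A hedged guess at a fix inside format_image_grid:

from PIL import Image
from torchvision.transforms.functional import to_pil_image

# Sketch: the grid code assumes PIL images, so convert tensors first.
if not isinstance(example_image, Image.Image):
    example_image = to_pil_image(example_image)  # handles CHW image tensors
w, h = example_image.size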

All Models Out of Date (Possible downgrade feature?)

All the models currently get downloaded for 1.1.0, but the latest version is 1.4.6. It would be great to have an option to downgrade DALLE2-pytorch so these stay compatible for the time being!

Error message after install

Hi, I get the following error message on Ubuntu 22.04 under WSL2:

python example_inference.py test
Checksum: https://huggingface.co/laion/DALLE2-PyTorch/raw/main/decoder/v1.0.2/latest.pth
Checksum: https://huggingface.co/Veldrovive/upsamplers/raw/main/working/latest.pth
Checksum: https://huggingface.co/laion/DALLE2-PyTorch/raw/main/prior/latest.pth
Checksum mismatch. Deleting models/new_decoder.pth and downloading again.
WARNING: This decoder was trained on an old version of Dalle2. This may result in the model failing to load or it may lead to producing garbage results.
Killed

I am using Dalle2-PyTorch 1.1.0.

HW Requirements for running example_inference.py

Thanks a lot for sharing this great resource.
I am trying to run example_inference.py, but it appears to get stuck after the resources are downloaded and the following message is printed:
"WARNING: This decoder was trained on an old version of Dalle2. This may result in the model failing to load or it may lead to producing garbage results."
What HW do I need to run example_inference?
My goal is to generate some images from text: is example_inference.py the right approach for this?
Many Thanks

Cuda:1 doesn't work

If I set
"devices": "cuda:1"
in the config JSON, the model is still loaded onto device 0, and an error about tensors being on different devices occurs during computation.
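
Until that is fixed, a common workaround (a sketch; it assumes the scripts only ever need one GPU) is to hide the other devices before torch initializes CUDA, so physical GPU 1 shows up as cuda:0 inside the process and the config can stay at "devices": "cuda:0":

import os

# Must run before anything touches the CUDA driver (including importing
# modules that initialize it).
os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # physical GPU 1 becomes cuda:0

import torch
print(torch.cuda.get_device_name(0))  # should report the second GPU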

RuntimeError: Model ViT-L/14 not found

Running the default example_inference.py results in a runtime error due to a specified model that does not exist. The full traceback is:

Traceback (most recent call last):
  File "example_inference.py", line 151, in <module>
    inference(obj={})
  File "/home/dev/.local/lib/python3.8/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/dev/.local/lib/python3.8/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/dev/.local/lib/python3.8/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/dev/.local/lib/python3.8/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/dev/.local/lib/python3.8/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/dev/.local/lib/python3.8/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "example_inference.py", line 43, in dream
    dreamer: BasicInference = BasicInference.create(model_config, verbose=verbose)
  File "/home/dev/deepLearning/dalle2-laion/dalle2_laion/scripts/InferenceScript.py", line 29, in create
    model_manager = DalleModelManager(config)
  File "/home/dev/deepLearning/dalle2-laion/dalle2_laion/dalle2_laion.py", line 103, in __init__
    self.clip = model_load_config.clip.create()
  File "/home/dev/.local/lib/python3.8/site-packages/dalle2_pytorch/train_configs.py", line 119, in create
    return OpenAIClipAdapter(self.model)
  File "/home/dev/.local/lib/python3.8/site-packages/dalle2_pytorch/dalle2_pytorch.py", line 281, in __init__
    openai_clip, preprocess = clip.load(name)
  File "/home/dev/.local/lib/python3.8/site-packages/clip_anytorch-2.2.1-py3.8.egg/clip/clip.py", line 115, in load
    raise RuntimeError(f"Model {name} not found; available models = {available_models()}")
RuntimeError: Model ViT-L/14 not found; available models = ['RN50', 'RN101', 'RN50x4', 'RN50x16', 'ViT-B/32', 'ViT-B/16']

This is pulled directly from the Hugging Face repo, so that one likely needs to be corrected. I will update here if I figure out a workaround.
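
One likely workaround (not verified against this repo): the traceback shows clip_anytorch 2.2.1, whose model list predates ViT-L/14, so upgrading the clip package should make the model available. You can check what your install exposes with:

import clip

# If 'ViT-L/14' is missing from this list, the installed clip package
# predates that model; a newer clip/clip-anytorch release should include it.
print(clip.available_models())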

example_inference produces noise

Hi. Thank you for sharing this.
I tried to run python example_inference.py dream,
but it always gives me the warning WARNING: This decoder was trained on version 1.1.0 but the current version is 1.11.1. This may result in the model failing to load. (although I definitely installed version 1.1.0, not 1.11.1)
and it does produce garbage results:
[image: garbage output]
Is there a way around this?
Thank you

RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'

I don't have enough VRAM, so when I switch to the CPU device, I get this error:

WARNING: This decoder was trained on an old version of Dalle2. This may result in the model failing to load or it may lead to producing garbage results.
WARNING: This prior was trained on an old version of Dalle2. This may result in the model failing to load or it may produce garbage results.
Traceback (most recent call last):
  File "G:/Project/Paint/dalle2-laion/test00.py", line 27, in <module>
    image = inference.run("Hello World")
  File "G:/Project/Paint/dalle2-laion/test00.py", line 14, in run
    image_embedding_map = self._sample_prior(text)
  File "G:\Project\Paint\dalle2-laion\dalle2_laion\scripts\InferenceScript.py", line 270, in _sample_prior
    embeddings = prior.sample(text_batch, cond_scale=cond_scale, num_samples_per_batch=num_samples_per_batch)
  File "D:\InstallPath\Develop\Anaconda3\2020.07\envs\Dalle2_laion_3.7\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "D:\InstallPath\Develop\Anaconda3\2020.07\envs\Dalle2_laion_3.7\lib\site-packages\dalle2_pytorch\dalle2_pytorch.py", line 95, in inner
    out = fn(model, *args, **kwargs)
  File "D:\InstallPath\Develop\Anaconda3\2020.07\envs\Dalle2_laion_3.7\lib\site-packages\dalle2_pytorch\dalle2_pytorch.py", line 1205, in sample
    text_embed, text_encodings = self.clip.embed_text(text)
  File "D:\InstallPath\Develop\Anaconda3\2020.07\envs\Dalle2_laion_3.7\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "D:\InstallPath\Develop\Anaconda3\2020.07\envs\Dalle2_laion_3.7\lib\site-packages\dalle2_pytorch\dalle2_pytorch.py", line 328, in embed_text
    text_embed = self.clip.encode_text(text)
  File "D:\InstallPath\Develop\Anaconda3\2020.07\envs\Dalle2_laion_3.7\lib\site-packages\clip\model.py", line 350, in encode_text
    x = self.transformer(x)
  File "D:\InstallPath\Develop\Anaconda3\2020.07\envs\Dalle2_laion_3.7\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\InstallPath\Develop\Anaconda3\2020.07\envs\Dalle2_laion_3.7\lib\site-packages\clip\model.py", line 204, in forward
    return self.resblocks(x)
  File "D:\InstallPath\Develop\Anaconda3\2020.07\envs\Dalle2_laion_3.7\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\InstallPath\Develop\Anaconda3\2020.07\envs\Dalle2_laion_3.7\lib\site-packages\torch\nn\modules\container.py", line 204, in forward
    input = module(input)
  File "D:\InstallPath\Develop\Anaconda3\2020.07\envs\Dalle2_laion_3.7\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\InstallPath\Develop\Anaconda3\2020.07\envs\Dalle2_laion_3.7\lib\site-packages\clip\model.py", line 191, in forward
    x = x + self.attention(self.ln_1(x))
  File "D:\InstallPath\Develop\Anaconda3\2020.07\envs\Dalle2_laion_3.7\lib\site-packages\clip\model.py", line 188, in attention
    return self.attn(x, x, x, need_weights=False, attn_mask=attn_mask)[0]
  File "D:\InstallPath\Develop\Anaconda3\2020.07\envs\Dalle2_laion_3.7\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\InstallPath\Develop\Anaconda3\2020.07\envs\Dalle2_laion_3.7\lib\site-packages\torch\nn\modules\activation.py", line 1174, in forward
    attn_mask=attn_mask, average_attn_weights=average_attn_weights)
  File "D:\InstallPath\Develop\Anaconda3\2020.07\envs\Dalle2_laion_3.7\lib\site-packages\torch\nn\functional.py", line 5046, in multi_head_attention_forward
    q, k, v = _in_projection_packed(query, key, value, in_proj_weight, in_proj_bias)
  File "D:\InstallPath\Develop\Anaconda3\2020.07\envs\Dalle2_laion_3.7\lib\site-packages\torch\nn\functional.py", line 4737, in _in_projection_packed
    return linear(q, w, b).chunk(3, dim=-1)
RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'

Process finished with exit code 1
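
For context, PyTorch has no CPU kernels for fp16 matrix multiplies, which is what the addmm error is saying; the crash is inside CLIP, whose weights were loaded in fp16 here. The usual workaround (a sketch, not verified against this repo) is to cast to float32 before CPU inference:

import torch

def prepare_for_cpu(module: torch.nn.Module) -> torch.nn.Module:
    # "addmm_impl_cpu_" not implemented for 'Half': fp16 linear layers only
    # run on GPU, so cast the whole module to float32 for CPU sampling.
    return module.float().to("cpu")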

colab SwinR not working

Hi,

Thanks for sharing the pretrained models with us, I am very grateful for your contribution.

When running the dalle2_laion_alpha.ipynb notebook, I encountered the following error.
[screenshot of the error, 2022-07-15]

Any suggestion for this?

Thank you!

Run on cpu

Hey, can you run this without an NVIDIA GPU (preferably on a CPU)?

TypeError: Multiple inheritance with NamedTuple is not supported

Hello, I'm a newbie. I am trying to run example_inference.py with the dream command and enter a text string to generate the image, but this error comes out:

Traceback (most recent call last):
  File "C:\Users\Tullio\Desktop\dalle2\example_inference.py", line 1, in <module>
    from dalle2_laion import DalleModelManager, ModelLoadConfig, utils
  File "C:\Users\Tullio\Desktop\dalle2\dalle2_laion\__init__.py", line 1, in <module>
    from dalle2_laion.dalle2_laion import DalleModelManager
  File "C:\Users\Tullio\Desktop\dalle2\dalle2_laion\dalle2_laion.py", line 73, in <module>
    class ModelInfo(NamedTuple, Generic[ModelType]):
  File "C:\Users\Tullio\AppData\Local\Programs\Python\Python39\lib\typing.py", line 1929, in _namedtuple_mro_entries
    raise TypeError("Multiple inheritance with NamedTuple is not supported")
TypeError: Multiple inheritance with NamedTuple is not supported

Do you have any suggestions?

If I modify line 73 of dalle2_laion.py to
class ModelInfo(Generic[ModelType]):
the code continues and that error disappears.
However, this other error comes out:

Traceback (most recent call last):
  File "C:\Users\Tullio\Desktop\dalle2\example_inference.py", line 151, in <module>
    inference(obj={})
  File "C:\Users\Tullio\AppData\Local\Programs\Python\Python39\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\Tullio\AppData\Local\Programs\Python\Python39\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "C:\Users\Tullio\AppData\Local\Programs\Python\Python39\lib\site-packages\click\core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Users\Tullio\AppData\Local\Programs\Python\Python39\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\Tullio\AppData\Local\Programs\Python\Python39\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\Tullio\AppData\Local\Programs\Python\Python39\lib\site-packages\click\decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "C:\Users\Tullio\Desktop\dalle2\example_inference.py", line 43, in dream
    dreamer: BasicInference = BasicInference.create(model_config, verbose=verbose)
  File "C:\Users\Tullio\Desktop\dalle2\dalle2_laion\scripts\InferenceScript.py", line 29, in create
    model_manager = DalleModelManager(config)
  File "C:\Users\Tullio\Desktop\dalle2\dalle2_laion\dalle2_laion.py", line 92, in __init__
    self.decoder_info = self.load_decoder(model_load_config.decoder)
  File "C:\Users\Tullio\Desktop\dalle2\dalle2_laion\dalle2_laion.py", line 237, in load_decoder
    return ModelInfo(decoder, decoder_version, requires_clip, decoder_data_requirements)
TypeError: ModelInfo() takes no arguments
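
For reference: NamedTuple combined with Generic is only supported from Python 3.11, which is why 3.9 raises the first error, and dropping NamedTuple also drops the generated __init__, which is why the plain Generic class then "takes no arguments". A sketch of a 3.9-compatible rewrite (field names are guessed from the ModelInfo(...) call in load_decoder):

from dataclasses import dataclass
from typing import Generic, TypeVar

ModelType = TypeVar("ModelType")

@dataclass
class ModelInfo(Generic[ModelType]):
    # Field names below are hypothetical, inferred from the call
    # ModelInfo(decoder, decoder_version, requires_clip, decoder_data_requirements)
    model: ModelType
    version: object
    requires_clip: bool
    data_requirements: object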

gradio_inference: DuplicateBlockError

Hi. Thank you for sharing.
I got this error while trying to run the inference:
raise DuplicateBlockError(
gradio.exceptions.DuplicateBlockError: At least one block in this Blocks has already been rendered.

Could you advise on this?
Thank you

Error occurred in Google Colab notebook

Hello,

Thanks for your valuable work!
When I run the dalle2-laion notebook on Google Colab, the following error occurs.

AttributeError                            Traceback (most recent call last)
[<ipython-input-16-578a5f9162bc>](https://localhost:8080/#) in <module>
     86     img.save(f"./reranked_output/example_{index}.png")
     87 
---> 88 rerank_test(5, 50)

5 frames
[/usr/local/lib/python3.7/dist-packages/ftfy/__init__.py](https://localhost:8080/#) in fix_text(text, config, **kwargs)
    302     pos = 0
    303     while pos < len(text):
--> 304         textbreak = text.find("\n", pos) + 1
    305         if textbreak == 0:
    306             textbreak = len(text)

AttributeError: 'list' object has no attribute 'find'

If you have any idea, please let me know.

Thank you.

Want to make a contribution ...

Hey guys, I want to make a contribution to dalle2-laion, but I don't know where to start. Do you have any tutorials or documentation?

Dockerfile for WSL

Edit: made some small changes as I learn new things about Docker.
I put together a Dockerfile for us poor WSL users, so we can run the notebook from our browser in Windows.

FROM nvidia/cuda:11.6.2-base-ubuntu20.04

ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update && \
    apt-get install python3 -y && \
    apt-get install python3-pip -y && \
    apt-get install git curl wget ffmpeg libsm6 libxext6 -y

RUN pip3 install torch torchvision --extra-index-url https://download.pytorch.org/whl/cu116

RUN pip3 install jupyterlab notebook numpy pillow

WORKDIR /root/data/

CMD ["python3", "-m", "jupyter", "notebook", "/root/data/", "-i 0.0.0.0", "--allow-root", "--no-browser", "--NotebookApp.token=''"]

Then download the jupyter notebook to some local drive.

Run with

start wsl.exe -e sh -c "docker run  -p 8888:8888 --network host --shm-size=200m --rm --gpus all -it -v <WSL path to jupyter notebook directory>:/root/data <your docker image name>"

Then open your browser to localhost:8888. That's it!

Note: the WSL path to the notebook will be something like /mnt/c/.../, where c is the C: drive.

Feature Request: Arg to update models

Adding an argument to check for a newer version of the models instead of automatically skipping would be nice.

diff --git a/example_inference.py b/example_inference.py
index 746b83e..4f9412a 100644
--- a/example_inference.py
+++ b/example_inference.py
@@ -96,6 +96,7 @@ def variation(ctx, model_config: str, output_path: str, decoder_batch_size: int)
 @inference.command()
 @click.option('--model-config', default='./configs/upsampler.example.json', help='Path to model config file')
 @click.option('--output-path', default='./output/inpaint/', help='Path to output directory')
+@click.option('--update', default=False, is_flag=True, help="Check for updates to the models and download if a newer one is found.")
 @click.pass_context
 def inpaint(ctx, model_config: str, output_path: str):
     verbose = ctx.obj['verbose']

The checksum of the model is available in the commit to Hugging Face: example, but I don't see an obvious way of finding this programmatically to compare against a local file.
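
One possible programmatic route (a sketch based on how the Hub serves LFS files, not something wired into this repo): a HEAD request on the resolve URL reports the file's sha256 as the linked ETag, which can be compared against a hash of the local file:

import hashlib
import requests

def remote_sha256(url: str) -> str:
    # For LFS files the Hub reports the sha256 as the (linked) ETag.
    resp = requests.head(url, allow_redirects=False)
    etag = resp.headers.get("X-Linked-Etag") or resp.headers.get("ETag", "")
    return etag.strip('"')

def local_sha256(path: str) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

url = "https://huggingface.co/laion/DALLE2-PyTorch/resolve/main/prior/latest.pth"
if remote_sha256(url) != local_sha256("models/prior.pth"):
    print("newer model available")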

Performance compared to dalle-mini

Hello,

I'm trying to compare the performance of dalle2 (this repo - colab) with dalle-mini (https://huggingface.co/spaces/dalle-mini/dalle-mini). dalle-mini performs much better; I cannot get meaningful images from dalle2. Is there a specific setting I have to use to get a good result? Is it not completely trained? Or something else?

for example: "a programmer sitting in a coffeeshop."

dalle2 output:
[image: dalle2 sample]

dalle-mini output:
[image: dalle-mini sample]

bad results when generating images with inference code

Hi,
Thanks for sharing this great work.

I ran example_inference dream
to generate images, but the results are not satisfying; they seem irrelevant.

For example, for the prompt "smoke in a city", these are some of the results:

[image: smoke in a city _11]

[image: smoke in a city _12]

Inpainting using pretrained models

Hello,
First of all, thanks for all the good work. I have a question about the quality of some of the inpainted images I am testing. I used the models from example_inference.py and ran the corgi example for inpainting with the prompt "a yellow hat". These are the results I get: they look solid-colored and not very realistic. Am I doing something wrong, or is this just a limitation of the model?
Thanks again.
[image: corgi_2]
[image: corgi_3]

What pretraining datasets do you use?

Hello,

Thanks for your valuable work!
What pre-training datasets do you use for each component, and what are the parameter counts of the dalle2-laion models?

Thank you.

example_inference.py script yields weird results

I just tried to run the example script after installing dalle2-laion (I only changed setup.py so that dalle2-pytorch==1.1.0 is installed, to make the downloaded models happy with the dalle2-pytorch version in place),

but the results I get seem a bit odd. For example, here is the console output from running the script:

# python3 example_inference.py dream
Enter your prompts one by one. Enter an empty prompt to finish.
Prompt 1: a cat playing piano
Prompt 2: 
How many samples would you like to generate for each prompt? [1]: 4
Downloading https://huggingface.co/laion/DALLE2-PyTorch/resolve/main/decoder/v1.0.2/latest.pth to models/new_decoder.pth
Downloading https://huggingface.co/Veldrovive/upsamplers/resolve/main/working/latest.pth to models/second_decoder.pth
WARNING: This decoder was trained on an old version of Dalle2. This may result in the model failing to load or it may lead to producing garbage results.
Downloading https://huggingface.co/Veldrovive/upsamplers/raw/main/working/decoder_config.json to models/second_decoder_config.json
Downloading https://huggingface.co/laion/DALLE2-PyTorch/resolve/main/prior/latest.pth to models/prior.pth
WARNING: This prior was trained on an old version of Dalle2. This may result in the model failing to load or it may produce garbage results.
Downloading https://huggingface.co/laion/DALLE2-PyTorch/raw/main/prior/prior_config.json to models/prior_config.json

And here is an example of results I'm getting: https://ibb.co/SmpTFHs

Is this normal/expected? Am I missing something?

RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x10 and 768x1280)

Thanks for the great effort. I was trying out the inference script with the provided example code:

from PIL import Image as PILImage

from dalle2_laion import ModelLoadConfig, DalleModelManager
from dalle2_laion.scripts import InferenceScript

class ExampleInference(InferenceScript):
    def run(self, text: str) -> PILImage.Image:
        """
        Takes a string and returns a single image.
        """
        text = [text]
        image_embedding_map = self._sample_prior(text)
        image_embedding = image_embedding_map[0][0]
        image_map = self._sample_decoder(text=text, image_embed=image_embedding)
        return image_map[0][0]

model_config = ModelLoadConfig.from_json_path("path/to/config.json")
model_manager = DalleModelManager(model_config)
inference = ExampleInference(model_manager)
image = inference.run("Hello World")

But I encountered:
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x10 and 768x1280)
Any idea why? Thanks!

Below is the full error traceback:

Traceback (most recent call last):
  File "/shared/nas/data/m1/wangz3/phy-lm-vid/third_party/dalle2-laion/dalle2_laion/scripts/playaround.py", line 64, in <module>
    output_im = inference.run("Hello World")
  File "/shared/nas/data/m1/wangz3/phy-lm-vid/third_party/dalle2-laion/dalle2_laion/scripts/playaround.py", line 58, in run
    image_map = self._sample_decoder(text=text, image_embed=image_embedding)
  File "/shared/nas/data/m1/wangz3/phy-lm-vid/third_party/dalle2-laion/dalle2_laion/scripts/InferenceScript.py", line 227, in _sample_decoder
    output_images = decoder.sample(**args, cond_scale=cond_scale)
  File "/shared/nas/data/m1/wangz3/miniconda/envs/phy-lm-vid/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/shared/nas/data/m1/wangz3/miniconda/envs/phy-lm-vid/lib/python3.9/site-packages/dalle2_pytorch/dalle2_pytorch.py", line 95, in inner
    out = fn(model, *args, **kwargs)
  File "/shared/nas/data/m1/wangz3/miniconda/envs/phy-lm-vid/lib/python3.9/site-packages/dalle2_pytorch/dalle2_pytorch.py", line 2809, in sample
    img = self.p_sample_loop(
  File "/shared/nas/data/m1/wangz3/miniconda/envs/phy-lm-vid/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/shared/nas/data/m1/wangz3/miniconda/envs/phy-lm-vid/lib/python3.9/site-packages/dalle2_pytorch/dalle2_pytorch.py", line 2661, in p_sample_loop
    return self.p_sample_loop_ddpm(*args, noise_scheduler = noise_scheduler, **k...
  File "/shared/nas/data/m1/wangz3/miniconda/envs/phy-lm-vid/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/shared/nas/data/m1/wangz3/miniconda/envs/phy-lm-vid/lib/python3.9/site-packages/dalle2_pytorch/dalle2_pytorch.py", line 2533, in p_sample_loop_ddpm
    img = self.p_sample(
  File "/shared/nas/data/m1/wangz3/miniconda/envs/phy-lm-vid/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/shared/nas/data/m1/wangz3/miniconda/envs/phy-lm-vid/lib/python3.9/site-packages/dalle2_pytorch/dalle2_pytorch.py", line 2476, in p_sample
    model_mean, _, model_log_variance = self.p_mean_variance(unet, x = x, t = t, ima...
  File "/shared/nas/data/m1/wangz3/miniconda/envs/phy-lm-vid/lib/python3.9/site-packages/dalle2_pytorch/dalle2_pytorch.py", line 2442, in p_mean_variance
    pred = default(model_output, lambda: unet.forward_with_cond_scale(x, t, image_em...
  File "/shared/nas/data/m1/wangz3/miniconda/envs/phy-lm-vid/lib/python3.9/site-packages/dalle2_pytorch/dalle2_pytorch.py", line 64, in default
    return d() if callable(d) else d
  File "/shared/nas/data/m1/wangz3/miniconda/envs/phy-lm-vid/lib/python3.9/site-packages/dalle2_pytorch/dalle2_pytorch.py", line 2442, in <lambda>
    pred = default(model_output, lambda: unet.forward_with_cond_scale(x, t, image_em...
  File "/shared/nas/data/m1/wangz3/miniconda/envs/phy-lm-vid/lib/python3.9/site-packages/dalle2_pytorch/dalle2_pytorch.py", line 1854, in forward_with_cond_scale
    logits = self.forward(*args, **kwargs)
  File "/shared/nas/data/m1/wangz3/miniconda/envs/phy-lm-vid/lib/python3.9/site-packages/dalle2_pytorch/dalle2_pytorch.py", line 1916, in forward
    image_hiddens = self.to_image_hiddens(image_embed)
  File "/shared/nas/data/m1/wangz3/miniconda/envs/phy-lm-vid/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/shared/nas/data/m1/wangz3/miniconda/envs/phy-lm-vid/lib/python3.9/site-packages/torch/nn/modules/container.py", line 141, in forward
    input = module(input)
  File "/shared/nas/data/m1/wangz3/miniconda/envs/phy-lm-vid/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/shared/nas/data/m1/wangz3/miniconda/envs/phy-lm-vid/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 103, in forward
    return F.linear(input, self.weight, self.bias)
  File "/shared/nas/data/m1/wangz3/miniconda/envs/phy-lm-vid/lib/python3.9/site-packages/torch/nn/functional.py", line 1848, in linear
    return torch._C._nn.linear(input, weight, bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x10 and 768x1280)
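
Not an answer, but a note on reading the traceback: the failing Linear is to_image_hiddens with a 768x1280 weight, i.e. it expects a 768-dimensional CLIP image embedding, while whatever reached it had a trailing dimension of 10. A first debugging step (hypothetical, inside ExampleInference.run) would be to inspect what the prior returned before handing it to the decoder:

# Hypothetical check inside ExampleInference.run, before _sample_decoder:
image_embedding = image_embedding_map[0][0]
print(type(image_embedding), getattr(image_embedding, "shape", None))
# Expected: a torch.Tensor whose last dimension is 768 for this decoder.
# A trailing dimension of 10 suggests the wrong object (e.g. tokens or a
# nested list entry) is being passed as image_embed.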
