siliconflow / onediff
OneDiff: An out-of-the-box acceleration library for diffusion models.
Home Page: https://github.com/siliconflow/onediff/wiki
This is my performance with oneflow diffusers on a P100:
it's 1.68 it/s.
But my performance with the official stable-diffusion is 2.32 it/s with the PLMS sampler.
Does it not support the P100 card yet?
Thank you so much!
diffusers version: 0.4.0.dev0
Hi,
I tried to run inference on a dreambooth model with oneflow. OneFlow cuts the inference time in half compared with normal inference, which is amazing. Then I tried to train a dreambooth model with OneFlow to reduce the training time, but my training gets stuck here:
accelerator = Accelerator(
    gradient_accumulation_steps=args.gradient_accumulation_steps,
    mixed_precision=args.mixed_precision,
    log_with=args.report_to,
    logging_dir=logging_dir,
)
and then it timed out after 30 minutes.
I used:
import oneflow as torch
Kindly guide me on how to train it with oneflow.
Thanks
Every time img2img is used, the UNet recompiles, which makes it impossible to use in production.
My guess is that this is because each user passes an initial image with a different size and aspect ratio.
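A possible mitigation, sketched under the assumption that the recompilation really is triggered by varying input shapes: snap every incoming image to a fixed resolution before calling the pipeline so the compiled graph is reused. TARGET and normalize are hypothetical names.

from PIL import Image

# Snap arbitrary user uploads to one fixed resolution so the compiled
# UNet graph can be reused instead of recompiling for each input shape.
TARGET = (512, 512)  # hypothetical serving resolution

def normalize(img: Image.Image) -> Image.Image:
    return img.convert("RGB").resize(TARGET, Image.LANCZOS)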
Oneflow + cuda11.6
See https://github.com/huggingface/diffusers/blob/main/examples/community/lpw_stable_diffusion.py for reference.
It supports prompts longer than 77 tokens, and lets you use () and [] to weight and emphasize elements of the prompt, which is very practical.
Please support an OneFlow version of StableDiffusionLongPromptWeightingPipeline.
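For context on the requested syntax: in the community lpw pipeline, (word) multiplies a span's embedding weight by 1.1, [word] divides it by 1.1, and (word:1.5) sets an explicit factor. The toy parser below only illustrates that rule; it is not the community pipeline's actual code and ignores nesting and escapes.

import re

# Toy illustration of ()/[] prompt weighting: returns (text, weight) pairs.
def parse_weights(prompt):
    pattern = r"\(([^:()]+):([\d.]+)\)|\(([^()]+)\)|\[([^\[\]]+)\]|([^()\[\]]+)"
    out = []
    for m in re.finditer(pattern, prompt):
        if m.group(1):                      # (word:1.5) -> explicit weight
            out.append((m.group(1), float(m.group(2))))
        elif m.group(3):                    # (word) -> weight * 1.1
            out.append((m.group(3), 1.1))
        elif m.group(4):                    # [word] -> weight / 1.1
            out.append((m.group(4), 1 / 1.1))
        elif m.group(5):                    # plain text -> weight 1.0
            out.append((m.group(5), 1.0))
    return out

print(parse_weights("a (cat:1.3) on a [rainy] street"))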
Traceback (most recent call last):
File "/home/user/Software/Anaconda/envs/test/lib/python3.8/site-packages/transformers/utils/import_utils.py", line 1050, in _get_module
return importlib.import_module("." + module_name, self.__name__)
File "/home/user/Software/Anaconda/envs/test/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 783, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/home/user/Software/Anaconda/envs/test/lib/python3.8/site-packages/transformers/models/clip/modeling_clip.py", line 27, in <module>
from ...modeling_utils import PreTrainedModel
File "/home/user/Software/Anaconda/envs/test/lib/python3.8/site-packages/transformers/modeling_utils.py", line 41, in <module>
from .generation_utils import GenerationMixin
File "/home/user/Software/Anaconda/envs/test/lib/python3.8/site-packages/transformers/generation_utils.py", line 61, in <module>
from .pytorch_utils import torch_int_div
File "/home/user/Software/Anaconda/envs/test/lib/python3.8/site-packages/transformers/pytorch_utils.py", line 19, in <module>
from torch import _softmax_backward_data, nn
File "<frozen importlib._bootstrap>", line 1039, in _handle_fromlist
File "/home/user/Software/Anaconda/envs/test/lib/python3.8/site-packages/oneflow/mock_torch/__init__.py", line 42, in __getattr__
raise NotImplementedError(self.module.__name__ + "." + name + error_msg)
NotImplementedError: oneflow._softmax_backward_data is not implemented, please submit an issue at
'https://github.com/Oneflow-Inc/oneflow/issues' including the log information of the error, the
minimum reproduction code, and the system information.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/user/Desktop/yao/projects/Text2img/AIdraw/en/utils/scheduling_ddim_oneflow.py", line 9, in <module>
from diffusers.configuration_utils import ConfigMixin, register_to_config
File "/home/user/Software/Anaconda/envs/test/lib/python3.8/site-packages/diffusers/__init__.py", line 22, in <module>
from transformers import CLIPTextModel, CLIPFeatureExtractor
File "<frozen importlib._bootstrap>", line 1039, in _handle_fromlist
File "/home/user/Software/Anaconda/envs/test/lib/python3.8/site-packages/transformers/utils/import_utils.py", line 1041, in __getattr__
value = getattr(module, name)
File "/home/user/Software/Anaconda/envs/test/lib/python3.8/site-packages/transformers/utils/import_utils.py", line 1040, in __getattr__
module = self._get_module(self._class_to_module[name])
File "/home/user/Software/Anaconda/envs/test/lib/python3.8/site-packages/transformers/utils/import_utils.py", line 1052, in _get_module
raise RuntimeError(
RuntimeError: Failed to import transformers.models.clip.modeling_clip because of the following error (look up to see its traceback):
oneflow._softmax_backward_data is not implemented, please submit an issue at
'https://github.com/Oneflow-Inc/oneflow/issues' including the log information of the error, the
minimum reproduction code, and the system information.
I couldn't find any information about those. Could you give me some sample code?
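Not sample code for the missing op itself, but the import-order workaround referenced later in this thread (the comment on issue #104) is reported to avoid this class of error: make transformers bind against real PyTorch before OneFlow's torch mock can intercept it. A minimal sketch under that assumption:

# Import transformers while real PyTorch is still visible, so its
# `from torch import ...` lines resolve against real torch, not the mock.
import transformers  # noqa: F401
import oneflow as torch
from diffusers import OneFlowStableDiffusionPipeline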
Request to add an oneflow pipe for the diffusers repo code file:
pipeline_stable_diffusion_inpaint.py
The running environment is wsl2 Ubuntu 20.04, neither the host nor wsl2 is running any other CUDA programs.
ubuntu@DESKTOP-531RKJN:~$ python3 diffusers/tests/test_pipelines_oneflow_graph_load.py
libibverbs not available, ibv_fork_init skipped
==> Try to run graph save...
==> get_pipe try to run
get_pipe cuda mem before 1301.5
Fetching 12 files: 100%|██████████| 12/12 [00:00<00:00, 56488.94it/s]
get_pipe run time 15.074813842773438
get_pipe cuda mem after 1301.5
get_pipe cuda mem diff 0.0
<== get_pipe finish run
==> pipe_to_cuda try to run
pipe_to_cuda cuda mem before 1301.5
pipe_to_cuda run time 1.1066811084747314
pipe_to_cuda cuda mem after 4061.5
pipe_to_cuda cuda mem diff 2760.0
<== pipe_to_cuda finish run
==> config_graph try to run
config_graph cuda mem before 4061.5
config_graph run time 1.5735626220703125e-05
config_graph cuda mem after 4061.5
config_graph cuda mem diff 0.0
<== config_graph finish run
sd init time 16.18261170387268 s.
==> text_to_image try to run
text_to_image cuda mem before 4061.5
100%|██████████| 50/50 [00:09<00:00, 5.53it/s]
W20230210 00:32:48.454388 8336 cudnn_conv_util.cpp:102] Currently available alogrithm (algo=1, require memory=7472256, idx=1) meeting requirments (max_workspace_size=1073741824, determinism=0) is not fastest. Fastest algorithm (1) requires memory 1074922512
text_to_image run time 9.699114561080933
text_to_image cuda mem after 8125.5
text_to_image cuda mem diff 4064.0
<== text_to_image finish run
==> text_to_image try to run
text_to_image cuda mem before 8125.5
/home/ubuntu/.local/lib/python3.8/site-packages/oneflow/nn/modules/module.py:152: UserWarning: Interpolate() is called in a nn.Graph, but not registered into a nn.Graph.
warnings.warn(
100%|██████████| 50/50 [00:15<00:00, 3.17it/s]
text_to_image run time 23.669822216033936
text_to_image cuda mem after 9561.5
text_to_image cuda mem diff 1436.0
<== text_to_image finish run
====> diff 0.0023254268
st init and run time 49.55777668952942 s.
==> save_pipe_sch try to run
save_pipe_sch cuda mem before 9561.5
terminate called after throwing an instance of 'oneflow::RuntimeException'
what(): Error: out of memory
Error message from /home/ci-user/runners/release/_work/oneflow/oneflow/oneflow/core/vm/op_call_instruction_policy.cpp:209
OpCallInstructionUtil::Compute(this, instruction): copy:OpCall:s_d2h
File "/home/ci-user/runners/release/_work/oneflow/oneflow/oneflow/core/vm/op_call_instruction_policy.cpp", line 209, in Compute
OpCallInstructionUtil::Compute(this, instruction)
File "/home/ci-user/runners/release/_work/oneflow/oneflow/oneflow/core/vm/op_call_instruction_policy.cpp", line 41, in Compute
AllocateOutputBlobsMemory(op_call_instruction_policy, allocator, instruction)
File "/home/ci-user/runners/release/_work/oneflow/oneflow/oneflow/core/vm/op_call_instruction_policy.cpp", line 89, in AllocateOutputBlobsMemory
blob_object->TryAllocateBlobBodyMemory(allocator)
File "/home/ci-user/runners/release/_work/oneflow/oneflow/oneflow/core/eager/eager_blob_object.cpp", line 100, in TryAllocateBlobBodyMemory
allocator->Allocate(&dptr, required_body_bytes)
File "/home/ci-user/runners/release/_work/oneflow/oneflow/oneflow/core/vm/bin_allocator.h", line 392, in Allocate
AllocateBlockToExtendTotalMem(aligned_size)
File "/home/ci-user/runners/release/_work/oneflow/oneflow/oneflow/core/vm/bin_allocator.h", line 305, in AllocateBlockToExtendTotalMem
backend_->Allocate(&mem_ptr, final_allocate_bytes)
File "/home/ci-user/runners/release/_work/oneflow/oneflow/oneflow/core/vm/ep_backend_host_allocator.cpp", line 25, in Allocate
ep_device_->AllocPinned(allocation_options_, reinterpret_cast<void**>(mem_ptr), size)
Error Type: oneflow.ErrorProto.runtime_error
File "/home/ci-user/runners/release/_work/oneflow/oneflow/oneflow/core/vm/op_call_instruction_policy.cpp", line 209, in operator()
Error Type: oneflow.ErrorProto.runtime_error
You can set ONEFLOW_DEBUG or ONEFLOW_PYTHON_STACK_GETTER to 1 to get the Python stack of the error.
Aborted
ubuntu@DESKTOP-531RKJN:~$
Originally posted by @MirrorCY in https://github.com/Oneflow-Inc/diffusers/issues/75#issuecomment-1424482749
All kinds of problems, depending on which version of xformers is installed. I tried the master dev branch, v0.0.13, and v0.0.12; all had some exceptions. I believe the main problem is that the system cannot cast/verify the inputs with oneflow.float32 when it should be torch.float32. Which version of xformers is the current dev version of oneflow tested against?
python3.10, cu117, the latest oneflow version.
When running the fp16 version described in https://github.com/Oneflow-Inc/diffusers/wiki/How-to-Run-OneFlow-Stable-Diffusion, I hit this error.
However, the fp32 version works fine.
import oneflow as torch
from diffusers import OneFlowStableDiffusionPipeline
import timeit

pipe = OneFlowStableDiffusionPipeline.from_pretrained(
    "./stable-diffusion-v1-4",
    revision="fp16",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

start = timeit.default_timer()
prompt = "a photo of an astronaut riding a horse on mars"
with torch.autocast("cuda"):
    images = pipe(prompt).images
    for i, image in enumerate(images):
        image.save(f"{prompt}-of-{i}.png")
end = timeit.default_timer()
print('Running time: %s Seconds' % (end - start))
loaded library: /usr/lib/x86_64-linux-gnu/libibverbs.so.1
[oneflow] [vae] diffusers.OneFlowAutoencoderKL
[diffusers] [tokenizer] transformers.CLIPTokenizer
[oneflow] [unet] diffusers.OneFlowUNet2DConditionModel
[oneflow] [safety_checker] stable_diffusion.OneFlowStableDiffusionSafetyChecker
[oneflow] [scheduler] diffusers.OneFlowPNDMScheduler
[diffusers] [feature_extractor] transformers.CLIPFeatureExtractor
[oneflow] [text_encoder] transformers.OneFlowCLIPTextModel
[oneflow] compiling unet beforehand to make sure the progress bar is more accurate
[oneflow] [elapsed(s)] [unet compilation] 25.13586367602693
98%|█████████▋| 50/51 [00:02<00:00, 22.99it/s]
Traceback (most recent call last):
File "demo.py", line 17, in <module>
images = pipe(prompt).images
File "/opt/conda/lib/python3.8/site-packages/oneflow/autograd/autograd_mode.py", line 154, in wrapper
return func(*args, **kwargs)
File "/code/diffusion/diffusers/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_oneflow.py", line 321, in __call__
latents = self.scheduler.step(noise_pred, t, latents, **extra_step_kwargs).prev_sample
File "/code/diffusion/diffusers/src/diffusers/schedulers/scheduling_pndm_oneflow.py", line 223, in step
return self.step_plms(model_output=model_output, timestep=timestep, sample=sample, return_dict=return_dict)
File "/code/diffusion/diffusers/src/diffusers/schedulers/scheduling_pndm_oneflow.py", line 338, in step_plms
prev_sample = self._get_prev_sample(sample, timestep, prev_timestep, model_output)
File "/code/diffusion/diffusers/src/diffusers/schedulers/scheduling_pndm_oneflow.py", line 362, in _get_prev_sample
if (alpha_prod_t_prev.dtype == torch.float64):
TypeError: Cannot interpret 'oneflow.float64' as a data type
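The comparison fails because NumPy is asked to interpret oneflow.float64 (aliased as torch.float64 here) as a NumPy dtype. A hedged local workaround sketch, under the assumption that alpha_prod_t_prev is a NumPy value at this point in scheduling_pndm_oneflow.py, is to compare dtype names instead:

# In _get_prev_sample, replace the cross-framework dtype comparison:
#     if (alpha_prod_t_prev.dtype == torch.float64):
# with a string comparison NumPy cannot choke on:
if str(alpha_prod_t_prev.dtype) == "float64":
    ...  # keep whatever the original branch body did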
diffusers version: 0.4.0.dev0
When running img2img, loading the model raises an error.
import requests
import torch
from PIL import Image
from io import BytesIO
# from diffusers import StableDiffusionImg2ImgPipeline
from diffusers import OneFlowStableDiffusionImg2ImgPipeline as StableDiffusionImg2ImgPipeline

device = "cuda"
model_id_or_path = "stabilityai/stable-diffusion-2-1"
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(model_id_or_path, torch_dtype=torch.float16)
model_id_or_path = "./stable-diffusion-v1-5"
generator = torch.Generator("cuda").manual_seed(42)
pipe = pipe.to(device)

response = requests.get(url)  # url is not defined in the snippet as posted
init_image = Image.open(BytesIO(response.content)).convert("RGB")
init_image = init_image.resize((768, 512))
prompt = "A fantasy landscape, trending on artstation"
images = pipe(prompt=prompt, generator=generator, image=init_image, strength=0.75, guidance_scale=7.5).images
Fetching 16 files: 100%|██████████| 16/16 [00:00<00:00, 25940.81it/s]
The config attributes {'upcast_attention': True} were passed to OneFlowUNet2DConditionModel, but are not expected and will be ignored. Please verify your config.json configuration file.
Traceback (most recent call last):
File "sd-i.py", line 13, in <module>
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(model_id_or_path, torch_dtype=torch.float16)
File "/root/zsf/oneflow/diffusers/src/diffusers/pipeline_oneflow_utils.py", line 706, in from_pretrained
loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
File "/root/zsf/oneflow/diffusers/src/diffusers/modeling_oneflow_utils.py", line 518, in from_pretrained
raise ValueError(
ValueError: torch.float16 needs to be of type `torch.dtype`, e.g. `torch.float16`, but is <class 'torch.dtype'>.
.......
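The snippet above imports real PyTorch (import torch), so torch.float16 is a PyTorch dtype while the OneFlow fork expects an oneflow dtype, which is presumably why the check fires with such a confusing message. A minimal sketch of the likely fix, following the other OneFlow examples in this thread:

# Alias oneflow as torch so torch.float16 below is oneflow.float16,
# the dtype the OneFlow fork's from_pretrained expects.
import oneflow as torch
from diffusers import OneFlowStableDiffusionImg2ImgPipeline as StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "./stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)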
Using the same seed and code, the generated results are still not the same as huggingface diffusers and look relatively fuzzy.
I copy-and-pasted the docker script from the wiki and tested on an A100.
The inference speed was as fast as you claimed, but the result doesn't look good.
The prompt was the default one, "a photo of an astronaut riding a horse on mars".
As shown in this image, there are two astronauts in the image. One of them has 3 legs while the other is riding a twisted motorcycle.
Do you have any idea why it's not producing a good image?
I used your docker image
After the diffusers repo was updated, almost none of the original tests examples can run.
Following the new example https://github.com/Oneflow-Inc/diffusers/blob/main/examples/text_to_image_sd2.py,
I changed the imports in https://github.com/Oneflow-Inc/diffusers/blob/oneflow-fork/tests/test_pipelines_oneflow_graph_load.py from

from diffusers import (
    OneFlowStableDiffusionPipeline as StableDiffusionPipeline,
    OneFlowEulerDiscreteScheduler as EulerDiscreteScheduler,
)
from diffusers import utils

to

from onediff import OneFlowStableDiffusionPipeline as StableDiffusionPipeline
from diffusers import EulerDiscreteScheduler
from diffusers import utils

After that, running it raises an error:
==> Try to run graph save...
==> function get_pipe try to run...
get_pipe cuda mem before 2854.75 MB
get_pipe host mem before 1729.0 MB
Fetching 12 files: 100%|██████████| 12/12 [00:00<00:00, 49490.31it/s]
<frozen importlib._bootstrap>:283: DeprecationWarning: the load_module() method is deprecated and slated for removal in Python 3.12; use exec_module() instead
E
======================================================================
ERROR: test_sd_graph_save_and_load (__main__.OneFlowPipeLineGraphSaveLoadTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/zhaodi/work/test.py", line 171, in test_sd_graph_save_and_load
_test_sd_graph_save_and_load(True, f0 ,f1, f2)
File "/home/zhaodi/work/test.py", line 76, in _test_sd_graph_save_and_load
sch, pipe = get_pipe()
File "/home/zhaodi/work/test.py", line 28, in new_fn
out = fn(*args, **kwargs)
File "/home/zhaodi/work/test.py", line 72, in get_pipe
sd_pipe = StableDiffusionPipeline.from_pretrained(
File "/home/zhaodi/miniconda3/lib/python3.10/site-packages/diffusers/pipelines/pipeline_utils.py", line 739, in from_pretrained
loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
File "/home/zhaodi/miniconda3/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2325, in from_pretrained
dtype_orig = cls._set_default_torch_dtype(torch_dtype)
File "/home/zhaodi/miniconda3/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1109, in _set_default_torch_dtype
torch.set_default_dtype(dtype)
File "/home/zhaodi/miniconda3/lib/python3.10/site-packages/torch/__init__.py", line 395, in set_default_dtype
_C._set_default_dtype(d)
TypeError: invalid dtype object: only floating-point types are supported as the default type
----------------------------------------------------------------------
Ran 1 test in 8.609s
Linux
diffusers 0.12.1
onediff 0.1.0 /home/zhaodi/onediff/src
oneflow 0.8.0
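A guess at the cause, based on the traceback above: with the onediff import style, torch_dtype may end up being an oneflow dtype that stock diffusers forwards into real PyTorch's set_default_dtype. A hedged sketch of the corresponding fix (model_id and the loader arguments are assumptions, since the test's actual call isn't shown):

import torch as og_torch  # real PyTorch

sd_pipe = StableDiffusionPipeline.from_pretrained(
    model_id,                      # hypothetical; actual arguments not shown
    torch_dtype=og_torch.float16,  # hand stock diffusers a real torch dtype
)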
diffusers 0.4.0.dev0
When I ran OneFlowStableDiffusionPipeline locally, I met this problem:
/////////////////////
Traceback (most recent call last):
File "test_of.py", line 2, in <module>
from diffusers import OneFlowStableDiffusionPipeline
File "/mnt/yinlong/project/disco/one-flow/diffusers/src/diffusers/__init__.py", line 21, in <module>
from .models import AutoencoderKL, UNet2DConditionModel, UNet2DModel, VQModel
File "/mnt/yinlong/project/disco/one-flow/diffusers/src/diffusers/models/__init__.py", line 28, in <module>
from .unet_2d_condition_oneflow import OneFlowUNet2DConditionModel
File "/mnt/yinlong/project/disco/one-flow/diffusers/src/diffusers/models/unet_2d_condition_oneflow.py", line 6, in <module>
import oneflow.utils.checkpoint
ModuleNotFoundError: No module named 'oneflow.utils.checkpoint'
////////////////////////////////////
What is the problem? A bug?
The command failed too, with the same problem as above.
Hello, is it possible to compile the model for dynamic resolution generation rather than static, similar to TensorRT?
I see in the code that the compilation call is made either if the model hasn't been compiled already, or if the request is for a different resolution than the already compiled one.
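nn.Graph compiles for static shapes, so nothing in this thread shows true dynamic-shape compilation like TensorRT's dynamic profiles; the usual workaround is to warm the compile cache for every resolution you expect to serve. A sketch under that assumption, with a hypothetical resolution list:

import oneflow as torch
from diffusers import OneFlowStableDiffusionPipeline

pipe = OneFlowStableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", revision="fp16", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

# One compilation per (height, width) up front instead of on the first request.
for h, w in [(512, 512), (512, 768), (768, 512)]:
    with torch.autocast("cuda"):
        pipe("warmup", height=h, width=w, num_inference_steps=1)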
OneFlowStableDiffusionImg2ImgPipeline doesn't seem to be adapted yet; the data types don't match.
import oneflow as torch
from diffusers import OneFlowStableDiffusionPipeline as StableDiffusionPipeline, OneFlowDPMSolverMultistepScheduler as DPMSolverMultistepScheduler, OneFlowStableDiffusionImg2ImgPipeline as StableDiffusionImg2ImgPipeline
model_id = "./stable-diffusion-2-model"
scheduler = DPMSolverMultistepScheduler.from_pretrained(model_id, subfolder="scheduler")
img2img = StableDiffusionImg2ImgPipeline.from_pretrained(model_id, scheduler=scheduler, torch_dtype=torch.float16)
img2img = img2img.to("cuda")
r = img2img(prompt="a cat", image=image, num_inference_steps=25)
[ERROR](GRAPH:UNetGraph_0:UNetGraph) building graph got error.
ERROR:root:Internal Error: Exception msg InferDataType Failed. Expected kFloat, but got kFloat16
...
I compiled a graph with OneFlowStableDiffusionPipeline via https://github.com/Oneflow-Inc/diffusers/blob/oneflow-fork/tests/test_pipelines_oneflow_graph_load.py, but using it with OneFlowStableDiffusionImg2ImgPipeline raises an error.
import os
import numpy as np
from PIL import Image
import oneflow as torch
from diffusers import DDIMScheduler  # imports added for completeness; not in the snippet as posted
from diffusers import OneFlowStableDiffusionImg2ImgPipeline

scheduler = DDIMScheduler(beta_start=0.00085, beta_end=0.012,
                          beta_schedule="scaled_linear", clip_sample=False, set_alpha_to_one=False,
                          steps_offset=1)
sd_pipe = OneFlowStableDiffusionImg2ImgPipeline.from_pretrained(
    "3_pipe_file_path", scheduler=scheduler, revision="fp16", torch_dtype=torch.float16
)
sd_pipe.to("cuda:0")
sd_pipe.set_graph_compile_cache_size(5)
sd_pipe.enable_graph_share_mem()
sd_pipe.load_graph("1_graph_save_path", compile_unet=True, compile_vae=False)

prompt = "Pale green clouds,a castle with a garden full of flowers is above the clouds ,light effect,by Makoto Shinkai and Claude Monet,trending on behance,8K"
image = 'data/init_images/1.jpg'
img = sd_pipe(
    prompt,
    image=image,
    strength=0.8,
    height=512,
    width=512,
    num_inference_steps=50,
    guidance_scale=10,
    compile_unet=True,
    compile_vae=False,
    num_images_per_prompt=1,
    eta=0.,
    generator=None,
    output_type="np",
).images
img_out = os.path.join('data/outputs/img2img/test', "%s_%s.%s" % (1, 1, 'jpg'))
Image.fromarray(img.astype(np.uint8)).save(img_out)
Linux, diffusers=0.10.0.dev, oneflow=0.9.1
An error is raised when num_images_per_prompt is changed at runtime.
The code below is OK:
import oneflow as torch
from diffusers import (
    OneFlowStableDiffusionPipeline as DiffusionPipeline,
    OneFlowDPMSolverMultistepScheduler as DPMSolverMultistepScheduler,
)
model_id = "stabilityai/stable-diffusion-2"
# Use the Euler scheduler here instead
scheduler = DPMSolverMultistepScheduler.from_pretrained(model_id, subfolder="scheduler")
pipe = DiffusionPipeline.from_pretrained(model_id, scheduler=scheduler, revision="fp16", torch_dtype=torch.float16)
pipe = pipe.to("cuda")
prompt = "a photo of an astronaut riding a horse on mars"
images = pipe(prompt, height=768, width=768, num_images_per_prompt=1).images
print(len(images))
images = pipe(prompt, height=768, width=768, num_images_per_prompt=1).images # **same** num_images_per_prompt
print(len(images))
The code below, where num_images_per_prompt changes at runtime, will raise an error:
import oneflow as torch
from diffusers import (
    OneFlowStableDiffusionPipeline as DiffusionPipeline,
    OneFlowDPMSolverMultistepScheduler as DPMSolverMultistepScheduler,
)
model_id = "stabilityai/stable-diffusion-2"
# Use the Euler scheduler here instead
scheduler = DPMSolverMultistepScheduler.from_pretrained(model_id, subfolder="scheduler")
pipe = DiffusionPipeline.from_pretrained(model_id, scheduler=scheduler, revision="fp16", torch_dtype=torch.float16)
pipe = pipe.to("cuda")
prompt = "a photo of an astronaut riding a horse on mars"
images = pipe(prompt, height=768, width=768, num_images_per_prompt=1).images
print(len(images))
images = pipe(prompt, height=768, width=768, num_images_per_prompt=2).images # **NOT same** num_images_per_prompt will raise an Error
print(len(images))
Error says:
RuntimeError: nn.Graph ONLY accepts static inputs tensor meta, please check whether your input tensor meta each step is the same as the input of first call graph.
The excepted tensor meta is: shape=(2,4,96,96), dtype=oneflow.float16, device=cuda:0, but the actual tensor meta is: shape=(4,4,96,96), dtype=oneflow.float16, device=cuda:0
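Since nn.Graph freezes input shapes at first compilation, a workaround sketch (an assumption, reusing the compile-cache API that another issue in this thread shows on the img2img pipeline) is to warm one graph per batch size before serving:

# Pre-compile one graph per num_images_per_prompt value; the cache size
# must be large enough to hold all of them.
pipe.set_graph_compile_cache_size(4)  # API shown elsewhere in this thread
for n in (1, 2):
    pipe(prompt, height=768, width=768, num_images_per_prompt=n,
         num_inference_steps=1)  # cheap warm-up call per shape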
python -m oneflow --doctor
loaded library: /usr/lib/x86_64-linux-gnu/libibverbs.so.1
path: ['/usr/local/miniconda3/envs/py3.10.8/lib/python3.10/site-packages/oneflow']
version: 0.8.1+cu112.git.2a86da23
git_commit: 2a86da23
cmake_build_type: Release
rdma: True
mlir: True
Run demo with docker:
docker run --rm -it --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -v ${HF_HOME}:${HF_HOME} -v ${PWD}:${PWD} -w ${PWD} -e HF_HOME=${HF_HOME} -e HUGGING_FACE_HUB_TOKEN=${HUGGING_FACE_HUB_TOKEN} oneflowinc/oneflow-sd:cu112 python3 /demos/oneflow-t2i.py --prompt "a photo of a cat riding a horse on mars"
WARNING: CUDA Minor Version Compatibility mode ENABLED.
Using driver version 470.42.01 which has support for CUDA 11.4. This container
was built with CUDA 11.8 and will be run in Minor Version Compatibility mode.
CUDA Forward Compatibility is preferred over Minor Version Compatibility for use
with this container but was unavailable:
[[System has unsupported display driver / cuda driver combination (CUDA_ERROR_SYSTEM_DRIVER_MISMATCH) cuInit()=803]]
See https://docs.nvidia.com/deploy/cuda-compatibility/ for details.
loaded library: /usr/lib/x86_64-linux-gnu/libibverbs.so.1
Fetching 16 files: 100%|██████████| 16/16 [00:00<00:00, 2110.81it/s]
[oneflow] [text_encoder] transformers.OneFlowCLIPTextModel
[oneflow] [unet] diffusers.OneFlowUNet2DConditionModel
[oneflow] [safety_checker] stable_diffusion.OneFlowStableDiffusionSafetyChecker
[diffusers] [feature_extractor] transformers.CLIPFeatureExtractor
[oneflow] [scheduler] diffusers.OneFlowPNDMScheduler
[diffusers] [tokenizer] transformers.CLIPTokenizer
ftfy or spacy is not installed using BERT BasicTokenizer instead of ftfy.
[oneflow] [vae] diffusers.OneFlowAutoencoderKL
[oneflow] compiling unet beforehand to make sure the progress bar is more accurate
[oneflow] [elapsed(s)] [unet compilation] 23.832770048989914
0%| | 0/51 [00:00<?, ?it/s]F20221108 02:54:25.671985 157 fused_multi_head_attention_inference_kernel.cu:150] UNIMPLEMENTED
*** Check failure stack trace: ***
@ 0x7f35f204bf6a google::LogMessage::Fail()
@ 0x7f35f204c252 google::LogMessage::SendToLog()
@ 0x7f35f204bad7 google::LogMessage::Flush()
@ 0x7f35f204e649 google::LogMessageFatal::~LogMessageFatal()
@ 0x7f35e6042d62 oneflow::user_op::(anonymous namespace)::DispatchArchTag<>()
@ 0x7f35eba61e2a oneflow::user_op::(anonymous namespace)::DispatchArchTag<>()
@ 0x7f35eba584c5 oneflow::user_op::(anonymous namespace)::DispatchCutlassFmha()
@ 0x7f35eba65c4a oneflow::user_op::(anonymous namespace)::FusedMultiHeadAttentionInferenceKernel::Compute()
@ 0x7f35eac67523 oneflow::UserKernel::ForwardUserKernel()
@ 0x7f35eac676a4 oneflow::UserKernel::ForwardDataContent()
@ 0x7f35eac35747 oneflow::Kernel::Forward()
@ 0x7f35eac360c0 oneflow::Kernel::Launch()
@ 0x7f35eacda00d oneflow::(anonymous namespace)::LightActor<>::ProcessMsg()
@ 0x7f35eb23dfd0 oneflow::Thread::PollMsgChannel()
@ 0x7f35eb23e348 _ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJZN7oneflow6ThreadC4ERKNS3_8StreamIdEEUlvE_EEEEE6_M_runEv
@ 0x7f35f20609af execute_native_thread_routine
@ 0x7f36c8db1609 start_thread
@ 0x7f36c8b70133 clone
Docker version: 20.10.14
Image: oneflowinc/oneflow-sd:cu112
Requesting support for SD v2.0 depth2image model:
pipeline_stable_diffusion_depth2img.py
We use OneFlowStableDiffusionPipeline to run a pretrained model. It's a huge performance improvement over the original.
However, if I increase the batch size to improve the average performance per image, this does not work: the GPU memory usage increases, but the average time per image even increases as well.
I tried using a list of prompts (batchlized_prompt = [p for p in range(num_images_per_prompt)]), or passing in the num_images_per_prompt parameter.
According to the callback, all the images are generated in parallel, but the total time is never reduced.
Any advice is super appreciated.
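One way to separate one-time compilation cost from steady-state throughput is to time a warmed-up call per batch size; the sketch below is hypothetical and assumes pipe and prompt are set up as in the examples above:

import time

for n in (1, 2, 4):
    pipe(prompt, num_images_per_prompt=n, num_inference_steps=1)  # warm-up/compile
    t0 = time.time()
    images = pipe(prompt, num_images_per_prompt=n).images
    dt = time.time() - t0
    print(f"batch={n}: {dt:.2f}s total, {dt / len(images):.2f}s per image")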
I got an error while following https://github.com/Oneflow-Inc/diffusers/wiki/How-to-Run-OneFlow-Stable-Diffusion.
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-32-9857761ca7ed> in <module>
2 from diffusers import OneFlowStableDiffusionPipeline
3
----> 4 pipe = OneFlowStableDiffusionPipeline.from_pretrained(
5 "CompVis/stable-diffusion-v1-4",
6 use_auth_token=True,
/usr/local/lib/python3.8/dist-packages/diffusers/pipeline_oneflow_utils.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
714 class_candidates = {c: class_obj for c in importable_classes.keys()}
715 else:
--> 716 with torch.mock_torch.enable():
717 # else we just import it from the library.
718 library = importlib.import_module(library_name)
AttributeError: module 'torch' has no attribute 'mock_torch'
Colab.
Tried almost every version until 0.7.0.dev0.
diffusers version: 0.7.0.dev0
Error when running OneFlow Stable Diffusion without docker.
Got an error like:
loaded library: /usr/lib/x86_64-linux-gnu/libibverbs.so.1
W20221108 11:30:13.097517 41659 cuda_device_descriptor_class.cpp:48] initialization error
Code is the same as the provided example
loaded library: /usr/lib/x86_64-linux-gnu/libibverbs.so.1
W20221108 11:30:13.097517 41659 cuda_device_descriptor_class.cpp:48] initialization error
Fetching 16 files: 100%|██████████| 16/16 [00:00<00:00, 17067.36it/s]
[oneflow] [unet] diffusers.OneFlowUNet2DConditionModel
[oneflow] [text_encoder] transformers.OneFlowCLIPTextModel
[oneflow] [safety_checker] stable_diffusion.OneFlowStableDiffusionSafetyChecker
[oneflow] [vae] diffusers.OneFlowAutoencoderKL
[diffusers] [feature_extractor] transformers.CLIPFeatureExtractor
[diffusers] [tokenizer] transformers.CLIPTokenizer
ftfy or spacy is not installed using BERT BasicTokenizer instead of ftfy.
[oneflow] [scheduler] diffusers.OneFlowPNDMScheduler
E20221108 11:30:41.811245 41659 cuda_device_manager_factory.cpp:65] Failed to get cuda runtime version: initialization error
F20221108 11:30:41.811539 41659 scheduler.cpp:125] Check failed: err : initialization error (3)
*** Check failure stack trace: ***
@ 0x7f29565f0f6a google::LogMessage::Fail()
@ 0x7f29565f1252 google::LogMessage::SendToLog()
@ 0x7f29565f0ad7 google::LogMessage::Flush()
@ 0x7f29565f3649 google::LogMessageFatal::~LogMessageFatal()
@ 0x7f294ef4e2f0 oneflow::boxing::collective::ExecutorImpl::Init()
@ 0x7f294ef4f6b5 oneflow::boxing::collective::Scheduler::Impl::Impl()
@ 0x7f294ef4fc93 oneflow::boxing::collective::Scheduler::Scheduler()
@ 0x7f294e6fa0b2 oneflow::MultiClientSessionContext::TryInit()
@ 0x7f294e6fab5e oneflow::MultiClientSessionContext::TryInit()
@ 0x7f2a1e564dcf (unknown)
@ 0x7f2a1e3f8f79 (unknown)
@ 0x4ffdb7 cfunction_call
@ 0x4f95eb _PyObject_MakeTpCall.localalias
@ 0x50c73f method_vectorcall
@ 0x4f4d0c _PyEval_EvalFrameDefault
@ 0x5001ff _PyFunction_Vectorcall
@ 0x4f0913 _PyEval_EvalFrameDefault
@ 0x50c44e method_vectorcall
@ 0x4f4d0c _PyEval_EvalFrameDefault
@ 0x4f893d _PyObject_FastCallDictTstate.localalias
@ 0x509bc8 slot_tp_init
@ 0x4f963b _PyObject_MakeTpCall.localalias
@ 0x4f4fcb _PyEval_EvalFrameDefault
@ 0x5001ff _PyFunction_Vectorcall
@ 0x4f89ed _PyObject_FastCallDictTstate.localalias
@ 0x509bc8 slot_tp_init
@ 0x4f9956 type_call
@ 0x50d0d9 PyObject_Call
@ 0x4f2c32 _PyEval_EvalFrameDefault
@ 0x50c44e method_vectorcall
@ 0x4f1592 _PyEval_EvalFrameDefault
@ 0x599fe2 _PyEval_Vector
Aborted (core dumped)
diffusers version: 0.4.0.dev0
Stuck on loaded library: /lib/x86_64-linux-gnu/libibverbs.so.1
My cuda and GPU info:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03 Driver Version: 510.47.03 CUDA Version: 11.6 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla T4 On | 00000000:00:1E.0 Off | 0 |
| N/A 39C P8 15W / 70W | 0MiB / 15360MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
I configured everything following the Without Docker instructions in https://github.com/Oneflow-Inc/diffusers/wiki/How-to-Run-OneFlow-Stable-Diffusion and completed all the steps, but executing from diffusers import OneFlowStableDiffusionPipeline
raises an error:
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/zhaodi/diffusers/src/diffusers/__init__.py", line 22, in <module>
from transformers import CLIPTextModel, CLIPFeatureExtractor
File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
File "/home/zhaodi/miniconda3/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1101, in __getattr__
value = getattr(module, name)
File "/home/zhaodi/miniconda3/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1100, in __getattr__
module = self._get_module(self._class_to_module[name])
File "/home/zhaodi/miniconda3/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1112, in _get_module
raise RuntimeError(
RuntimeError: Failed to import transformers.models.clip.modeling_clip because of the following error (look up to see its traceback):
oneflow.cuda.amp.GradScaler is not implemented, please submit an issue at
'https://github.com/Oneflow-Inc/oneflow/issues' including the log information of the error, the
minimum reproduction code, and the system information.
The import error can be fixed with this workaround: https://github.com/Oneflow-Inc/diffusers/issues/104#issuecomment-1434151151
I want to run this example: https://github.com/Oneflow-Inc/diffusers/blob/oneflow-fork/tests/test_pipelines_oneflow_graph_load.py
So, following https://github.com/Oneflow-Inc/diffusers/issues/104#issuecomment-1434151151, I imported the transformers module before from diffusers import ..., but a new error was triggered at runtime.
sd init time 373.8917450904846 s.
==> function text_to_image try to run...
text_to_image cuda mem before 4336.75 MB
text_to_image host mem before 9173.0 MB
E
======================================================================
ERROR: test_sd_graph_save_and_load (__main__.OneFlowPipeLineGraphSaveLoadTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/zhaodi/work/test.py", line 175, in test_sd_graph_save_and_load
_test_sd_graph_save_and_load(True, f0 ,f1, f2)
File "/home/zhaodi/work/test.py", line 141, in _test_sd_graph_save_and_load
no_g_images = text_to_image(prompt, (i, j), prefix=f"is_save_{str(is_save)}-", with_graph=False)
File "/home/zhaodi/work/test.py", line 32, in new_fn
out = fn(*args, **kwargs)
File "/home/zhaodi/work/test.py", line 117, in text_to_image
images = pipe(
File "/home/zhaodi/oneflow/python/oneflow/autograd/autograd_mode.py", line 154, in wrapper
return func(*args, **kwargs)
File "/home/zhaodi/diffusers/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_oneflow.py", line 620, in __call__
text_embeddings = self._encode_prompt(
File "/home/zhaodi/diffusers/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_oneflow.py", line 393, in _encode_prompt
text_embeddings = self.text_encoder(text_input_ids.to(device), attention_mask=attention_mask)
File "/home/zhaodi/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/zhaodi/miniconda3/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 816, in forward
return self.text_model(
File "/home/zhaodi/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/zhaodi/miniconda3/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 712, in forward
hidden_states = self.embeddings(input_ids=input_ids, position_ids=position_ids)
File "/home/zhaodi/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/zhaodi/miniconda3/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 227, in forward
inputs_embeds = self.token_embedding(input_ids)
File "/home/zhaodi/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/home/zhaodi/miniconda3/lib/python3.10/site-packages/torch/nn/modules/sparse.py", line 160, in forward
return F.embedding(
File "/home/zhaodi/miniconda3/lib/python3.10/site-packages/torch/nn/functional.py", line 2210, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
TypeError: embedding(): argument 'indices' (position 2) must be Tensor, not Tensor
----------------------------------------------------------------------
Ran 1 test in 374.084s
FAILED (errors=1)
Linux
oneflow 0.9.1.dev20230216+cu117
transformers 4.26.1
diffusers 0.10.0.dev0
huggingface-hub 0.12.0
Hello, I am trying to replicate the work of the Stable Diffusion Pipeline so I can have more control over it.
More specifically, I want to only load specific (compiled) Unets and VAEs.
To do this I've written the following script:
https://gist.github.com/chavinlo/79776f50006698e477796c4c58083623
Everything goes well until inference on the unet is attempted, more precisely at L248: https://gist.github.com/chavinlo/79776f50006698e477796c4c58083623#file-test-py-L248
The model thinks it's not compiled and tries to compile it, and returns the following error:
Traceback (most recent call last):
File "/root/node/test_the_test.py", line 5, in <module>
engine(json.load(open('/root/node/cfg/basic.json')))
File "/root/env/lib/python3.10/site-packages/oneflow/autograd/autograd_mode.py", line 154, in wrapper
return func(*args, **kwargs)
File "/root/node/test.py", line 248, in engine
noise_pred = unet_graph(latent_model_input, t, text_embeddings)
File "/root/diffusers/src/diffusers/oneflow_graph_compile_cache.py", line 61, in __call__
self.compile(*args, **kwargs)
File "/root/diffusers/src/diffusers/oneflow_graph_compile_cache.py", line 31, in compile
self.graph_._compile_from_shared(*args, **kwargs)
File "/root/env/lib/python3.10/site-packages/oneflow/nn/graph/graph.py", line 872, in _compile_from_shared
self._shared_graph._forward_job_proto.net.op
AttributeError: 'NoneType' object has no attribute 'net'
It's trying to use _graph; that object seems to have the net attribute along with the rest it's looking for,
but the script looks it up as "_shared_graph".
(On the right is the line where it breaks; on the left is the graph_ object above.)
So first, when unet_graph() is called, graph_._forward_job_proto is None,
but at the end of it self._shared_graph is None and the prior graph raises an AttributeError.
Then _compile_from_shared is called again? This time self.graph_ is net.
Then everything turns into a NameError where self is not defined, and unet_graph is called again?
And then it fails with graph_ being net and _shared_graph raising an AttributeError.
This is the lpw.py code if needed:
https://gist.github.com/chavinlo/9e4c5c6c8e0f82f882a04a3fe4e54d88
note that it's different from the other issue I created
Any suggestions?
File "/root/diffusers/src/diffusers/models/unet_2d_condition_oneflow.py", line 6, in
import oneflow.utils.checkpoint
import oneflow as torch
from diffusers import OneFlowStableDiffusionPipeline

pipe = OneFlowStableDiffusionPipeline.from_pretrained(
    local_model_path,
    use_auth_token=True,
    revision="fp16",
    torch_dtype=torch.float16,
)
Traceback (most recent call last):
File "demo_inference.py", line 8, in <module>
from diffusers import StableDiffusionPipeline
File "/root/diffusers/src/diffusers/__init__.py", line 21, in <module>
from .models import AutoencoderKL, UNet2DConditionModel, UNet2DModel, VQModel
File "/root/diffusers/src/diffusers/models/__init__.py", line 28, in <module>
from .unet_2d_condition_oneflow import OneFlowUNet2DConditionModel
File "/root/diffusers/src/diffusers/models/unet_2d_condition_oneflow.py", line 6, in <module>
import oneflow.utils.checkpoint
ModuleNotFoundError: No module named 'oneflow.utils.checkpoint'
Python version: 3.8.13 (default, Mar 28 2022, 11:38:47)
[GCC 7.5.0]
OS platform: Linux-4.19.0-19-amd64-x86_64-with-glibc2.17
OS architecture: x86_64
Torch version: 1.12.1+cu102
Cuda available: True
Cuda version: 10.2
CuDNN version: 7605
Number of GPUs available: 1
transformers version: 4.23.0.dev0
These generated results are clearly problematic and differ from the huggingface diffusers results. Did I do something wrong somewhere?
import oneflow as torch
import time
from diffusers import OneFlowStableDiffusionPipeline

pipe = OneFlowStableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    use_auth_token=True,
    revision="fp16",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")
prompt = "a photo of an astronaut riding a horse on mars"
for i in range(10):
    torch.cuda.synchronize()
    sampler_time = time.time()
    with torch.autocast("cuda"):
        images = pipe(prompt).images
    torch.cuda.synchronize()
    sampler1_time = time.time()
    print('loop_time:', sampler1_time - sampler_time)
    for j, image in enumerate(images):
        image.save(f"{prompt}-of-{j}-{i}.png")
diffusers-0.4.0.dev0
transformers-4.23.0.dev0
python3.8.13
pytorch 1.13.0a0+d0d6b1f
cuda11.8
Please add the modified code for img2img; currently only the txt2img code exists.
Hi,
I have a stable diffusion project that requires an attn mask and would like to use oneflow for acceleration. I see that xformers already supports it, but it's still too slow compared to oneflow. So when will fused_multi_head_attention_inference support an attn mask?
When I use a model locally trained with dreambooth, the following error occurs:
AttributeError: module transformers has no attribute CLIPImageProcessor
In the case of a model locally trained with dreambooth, model_index.json includes the following:
{
"_class_name": "StableDiffusionPipeline",
"_diffusers_version": "0.10.2",
"feature_extractor": [
"transformers",
"CLIPImageProcessor"
],
Traceback (most recent call last):
File "oneflow-test.py", line 17, in <module>
pipe = StableDiffusionPipeline.from_pretrained(
File "/src/diffusers/src/diffusers/pipeline_oneflow_utils.py", line 657, in from_pretrained
class_obj = getattr(library, class_name)
File "/src/transformers/src/transformers/utils/import_utils.py", line 1043, in __getattr__
raise AttributeError(f"module {self.__name__} has no attribute {name}")
AttributeError: module transformers has no attribute CLIPImageProcessor
Segmentation fault
diffusers version: 0.10.0.dev0
When I try:

from diffusers import StableDiffusionPipeline, OneFlowStableDiffusionPipeline
import oneflow as torch
# from diffusers import DDIMScheduler

model_path = "./xdiffusion"
prompt = "a cute girl, blue eyes, brown hair"
pipe = OneFlowStableDiffusionPipeline.from_pretrained(
    model_path,
    # scheduler=DDIMScheduler(
    #     beta_start=0.00085,
    #     beta_end=0.012,
    #     beta_schedule="scaled_linear",
    #     clip_sample=False,
    #     set_alpha_to_one=True,
    # )
)

def dummy(images, **kwargs):
    return images, False

pipe.safety_checker = dummy
pipe = pipe.to("cuda")
image = pipe(prompt, num_inference_steps=30).images[0]
image.save(f"output.png")
I got an error:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 (positional 0) and cpu:0 (positional 1)!
(ldm) [root@VM-0-3-centos models]# python test.py
libibverbs not available, ibv_fork_init skipped
[oneflow] [vae] diffusers.OneFlowAutoencoderKL
[diffusers] [tokenizer] transformers.CLIPTokenizer
[oneflow] [unet] diffusers.OneFlowUNet2DConditionModel
[oneflow] [safety_checker] stable_diffusion.OneFlowStableDiffusionSafetyChecker
[diffusers] [feature_extractor] transformers.CLIPFeatureExtractor
[oneflow] [text_encoder] transformers.OneFlowCLIPTextModel
[oneflow] [scheduler] diffusers.OneFlowDDIMScheduler
[oneflow] compiling unet beforehand to make sure the progress bar is more accurate
[oneflow] [elapsed(s)] [unet compilation] 32.545996212400496
0%| | 0/30 [00:00<?, ?it/s]
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /www/models/test.py:23 in <module> │
│ │
│ 20 │ return images, False │
│ 21 pipe.safety_checker = dummy │
│ 22 pipe = pipe.to("cuda") │
│ ❱ 23 image = pipe(prompt, num_inference_steps=30).images[0] │
│ 24 image.save(f"output.png") │
│ │
│ /root/.conda/envs/ldm/lib/python3.8/site-packages/oneflow/autograd/autograd_mode.py:154 in │
│ wrapper │
│ │
│ 151 │ def __call__(self, func): │
│ 152 │ │ def wrapper(*args, **kwargs): │
│ 153 │ │ │ with AutoGradMode(False): │
│ ❱ 154 │ │ │ │ return func(*args, **kwargs) │
│ 155 │ │ │
│ 156 │ │ return wrapper │
│ 157 │
│ │
│ /www/models/diffusers/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_oneflow │
│ .py:345 in __call__ │
│ │
│ 342 │ │ │ if isinstance(self.scheduler, LMSDiscreteScheduler): │
│ 343 │ │ │ │ latents = self.scheduler.step(noise_pred, i, latents, **extra_step_kwarg │
│ 344 │ │ │ else: │
│ ❱ 345 │ │ │ │ latents = self.scheduler.step(noise_pred, t, latents, **extra_step_kwarg │
│ 346 │ │ │ torch._oneflow_internal.profiler.RangePop() │
│ 347 │ │ │
│ 348 │ │ # scale and decode the image latents with vae │
│ │
│ /www/models/diffusers/src/diffusers/schedulers/scheduling_ddim_oneflow.py:259 in step │
│ │
│ 256 │ │ │
│ 257 │ │ # 3. compute predicted original sample from predicted noise also called │
│ 258 │ │ # "predicted x_0" of formula (12) from https://arxiv.org/pdf/2010.02502.pdf │
│ ❱ 259 │ │ pred_original_sample = (sample - beta_prod_t ** (0.5) * model_output) / alpha_pr │
│ 260 │ │ │
│ 261 │ │ # 4. Clip "predicted x_0" │
│ 262 │ │ if self.config.clip_sample: │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 (positional 0) and cpu:0 (positional 1)!
centos7.6
cuda11.3
python3.8
The negative prompt is very useful for improving image quality in txt2img, img2img, and inpaint tasks. Is it possible for oneflow to support it?
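For context, a negative prompt enters classifier-free guidance by replacing the empty unconditional prompt. The sketch below shows the idea only; encode and unet are hypothetical stand-ins for the pipeline's text encoder and UNet call, not the OneFlow fork's actual code:

# Classifier-free guidance with a negative prompt: the "unconditional"
# branch is conditioned on the negative prompt instead of "".
cond = encode(prompt)
uncond = encode(negative_prompt)  # instead of encode("")
noise_uncond = unet(latents, t, uncond)
noise_cond = unet(latents, t, cond)
noise_pred = noise_uncond + guidance_scale * (noise_cond - noise_uncond)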
Hi
Thanks for open-sourcing this. On a T4 card, I compared AITemplate and oneflow; oneflow is more stable quality-wise. But even though the optimization speeds things up a lot, generating one prompt (768*768, steps=20) still takes about 5 s. Can performance be improved with multiple cards?
Test images:
#####################
https://image.netfrp.com/uploads/63cd2778d8ad9.png
https://image.netfrp.com/uploads/63cd2799f0868.png
Traceback (most recent call last):
File "/root/tasker-anime_txt2img/run.py", line 102, in execute_task
result = inpaint(model, prompt, negative_prompt, img_url, mask_url, seed, strength, scale, steps)
File "/root/tasker-anime_txt2img/api.py", line 251, in inpaint
drawer.draw(
File "/root/tasker-anime_txt2img/drawer_oneflow.py", line 191, in draw
result = pipe(**params)
File "/root/.conda/envs/ai/lib/python3.10/site-packages/oneflow/autograd/autograd_mode.py", line 154, in wrapper
return func(*args, **kwargs)
File "/root/diffusers-oneflow-fork/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint_legacy_oneflow.py", line 565, in __call__
device = self._execution_device
File "/root/diffusers-oneflow-fork/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint_legacy_oneflow.py", line 279, in _execution_device
if self.device != torch.device("meta") or not hasattr(self.unet, "_hf_hook"):
RuntimeError: Expected one of cpu, cuda device type at start of device string: meta
### System Info
diffusers-oneflow
transformer-oneflow
installed like this:
git clone https://github.com/Oneflow-Inc/diffusers.git
cd diffusers
python3 -m pip install -e .[oneflow]
Traceback (most recent call last):
File "/home/terrance/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1110, in _get_module
return importlib.import_module("." + module_name, self.__name__)
File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 883, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "/home/terrance/.local/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 27, in <module>
from ...modeling_utils import PreTrainedModel
File "/home/terrance/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 83, in <module>
from accelerate import __version__ as accelerate_version
File "/home/terrance/.local/lib/python3.10/site-packages/accelerate/__init__.py", line 7, in <module>
from .accelerator import Accelerator
File "/home/terrance/.local/lib/python3.10/site-packages/accelerate/accelerator.py", line 27, in <module>
import torch.utils.hooks as hooks
File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 674, in _load_unlocked
File "<frozen importlib._bootstrap>", line 571, in module_from_spec
File "/home/terrance/.local/lib/python3.10/site-packages/oneflow/mock_torch/__init__.py", line 88, in create_module
raise NotImplementedError(oneflow_mod_fullname + error_msg)
NotImplementedError: oneflow.utils.hooks is not implemented, please submit an issue at
'https://github.com/Oneflow-Inc/oneflow/issues' including the log information of the error, the
minimum reproduction code, and the system information.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/terrance/oneflow/test_diffusion.py", line 2, in <module>
from diffusers import OneFlowStableDiffusionPipeline
File "/home/terrance/oneflow/diffusers/src/diffusers/__init__.py", line 22, in <module>
from transformers import CLIPTextModel, CLIPFeatureExtractor
File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
File "/home/terrance/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1101, in __getattr__
value = getattr(module, name)
File "/home/terrance/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1100, in __getattr__
module = self._get_module(self._class_to_module[name])
File "/home/terrance/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1112, in _get_module
raise RuntimeError(
RuntimeError: Failed to import transformers.models.clip.modeling_clip because of the following error (look up to see its traceback):
oneflow.utils.hooks is not implemented, please submit an issue at
'https://github.com/Oneflow-Inc/oneflow/issues' including the log information of the error, the
minimum reproduction code, and the system information.
cuda 10.2
They simply can't. I am following the example provided in the Wiki.
I have tried loading the scheduler like this:
diffusers.OneFlowDPMSolverMultistepScheduler.from_config(sch_source, subfolder="scheduler", algorithm_type="dpmsolver")
following the example on the Wiki,
then passing it to the pipe like this:
if scheduler in schedulers:
    scheduler_obj = schedulers[scheduler]
    scheduler_obj.set_timesteps(steps, device=device)
    # scheduler_obj is a OneFlowDPMSolverMultistepScheduler object.
else:
    imgq('fail', f'scheduler {scheduler} not found')
    continue

with torch.autocast("cuda"):
    pipe.scheduler = scheduler_obj
    image = pipe(
        prompt=prompt,
        num_inference_steps=steps,
        guidance_scale=cfg,
        negative_prompt=negative_prompt,
        height=image_height,
        width=image_width,
        generator=torch.Generator().manual_seed(seed)
    )[0][0]
Traceback (most recent call last):
File "/workspace/node/threads/base.py", line 77, in image_generator
image = pipe(
File "/workspace/env/lib/python3.10/site-packages/oneflow/autograd/autograd_mode.py", line 154, in wrapper
return func(*args, **kwargs)
File "/workspace/diffusers/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_oneflow.py", line 676, in __call__
noise_pred = unet_graph(latent_model_input, t, text_embeddings)
File "/workspace/diffusers/src/diffusers/oneflow_graph_compile_cache.py", line 63, in __call__
return self.graph_(*args, **kwargs)
File "/workspace/env/lib/python3.10/site-packages/oneflow/nn/graph/graph.py", line 250, in __call__
return self.__run(*args, **kwargs)
File "/workspace/env/lib/python3.10/site-packages/oneflow/nn/graph/graph.py", line 1481, in __run
oneflow._oneflow_internal.nn.graph.RunLazyNNGraph(
oneflow._oneflow_internal.exception.RuntimeError: Error: nn.Graph ONLY accepts static inputs tensor meta, please check whether your input tensor meta each step is the same as the input of first call graph.
The excepted tensor meta is: shape=(), dtype=oneflow.int64, device=cpu:0, but the actual tensor meta is: shape=(), dtype=oneflow.float64, device=cpu:0. The input index is 1.
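The mismatched input at index 1 is the timestep: the graph was first compiled with an int64 t, and this scheduler apparently produces float64 timesteps. A workaround sketch (an assumption about where the cast belongs, with import oneflow as torch in scope):

# Cast the timestep back to the dtype the graph was compiled with,
# immediately before the UNet graph call.
t = t.to(torch.int64)
noise_pred = unet_graph(latent_model_input, t, text_embeddings)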
Using OneFlow latest version
Python 3.10.10
Torch 1.13.1
CUDA 12.0
Driver 525.78
A100-SXM4-80GB
Hello, I am trying to add prompt extension and weighting by slightly modifying the Stable Diffusion Pipeline.
I do this by replacing pipeline._encode_prompt with lpw_pipe._encode_prompt.
This is the lpw script: https://gist.github.com/chavinlo/b7ebc7e7dea59e311dab564fd452ff3c#file-lpw-py-L393
import oneflow as torch
import torch as og_torch
from .lpw import LongPromptWeightingPipeline

# load the text_model and tokenizer to be used on LPW
text_model = CLIPTextModel.from_pretrained(default_model, subfolder="text_encoder")
tokenizer_model = CLIPTokenizer.from_pretrained(default_model, subfolder="tokenizer")
text_model = text_model.to("cuda")
lpw_pipe = LongPromptWeightingPipeline(text_model, tokenizer_model, prompt_multiplier)
...
# Here I load multiple models from a configuration file.
pipe_map = dict()
for model in config['models']:
    print("Loading model:", model['model_path'])
    tmp_pipe = OneFlowStableDiffusionPipeline.from_pretrained(
        pretrained_model_name_or_path=model['model_path'],
        use_auth_token=True,
        torch_dtype=torch.float16
    )
    tmp_pipe.to("cuda")
    tmp_pipe._encode_prompt = lpw_pipe._encode_prompt
    tmp_pipe.enable_graph_share_mem()
    tmp_prompt = "Anime girl, beautiful"
    tmp_neg_prompt = "Disgusting, Horrible"
    for resolution in resultant_resolutions:
        print("Doing resolution:", resolution)
        with torch.autocast("cuda"):
            tmp_pipe(
                prompt=tmp_prompt,
                negative_prompt=tmp_neg_prompt,
                height=resolution[1],
                width=resolution[0]
            )
    pipe_map[model['alias']] = tmp_pipe
In normal circumstances it exits due to an AssertionError on assert og_torch.cuda.is_initialized() is False
at https://github.com/Oneflow-Inc/diffusers/blob/oneflow-fork/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_oneflow.py#L709.
If this assertion is removed, it goes through but uses 3 times the VRAM per resolution round.
Here's the complete script: https://gist.github.com/chavinlo/d8005ebda6499853891c9edae8765b4b
import os
import oneflow as torch
from diffusers import OneFlowStableDiffusionPipeline

pipe = OneFlowStableDiffusionPipeline.from_pretrained(
    '~/.cache/huggingface/diffusers/models--CompVis--stable-diffusion-v1-4/',
    revision="fp16",
    torch_dtype=torch.float16
)
OSError: We couldn't connect to 'https://huggingface.co' to load this model, couldn't find it in the cached files and it looks like ~/.cache/huggingface/diffusers/models--CompVis--stable-diffusion-v1-4/ is not the path to a directory containing a model_index.json file.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/diffusers/installation#offline-mode'.
When I try:

pipe = OneFlowStableDiffusionPipeline.from_pretrained(
    './stable-diffusion-v1-4',  # download from huggingface diffusers model
    revision="fp16",
    torch_dtype=torch.float16
)
RuntimeError: Error(s) in loading state_dict for OneFlowAutoencoderKL:
While copying the parameter "encoder.conv_in.weight", an exception occurred :
Traceback (most recent call last):
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/oneflow/nn/module.py", line 788, in _load_from_state_dict
param.copy_(input_param)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/oneflow/framework/tensor.py", line 341, in _copy
_copy_from_numpy_to_eager_local_tensor(self, other)
File "/opt/conda/envs/ldm/lib/python3.8/site-packages/oneflow/framework/tensor.py", line 290, in _copy_from_numpy_to_eager_local_tensor
assert np_arr.dtype == flow.convert_oneflow_dtype_to_numpy_dtype(
AssertionError
T4 gpu, 512*512
pytorch img2img pipeline: 6GB
oneflow img2img pipeline: from 6GB to 11GB peak
stable-diffusion-v1-5 fp16 model
centos 7, python 3.7
I am using oneflow with stable diffusion. If I generate the results in 512x512, it can generate the result in 1 second. If I change the width and height, it will generate the next result in ~10 seconds. Then it will generate normally afterwards on the same dimensions. So, a change in width and height causes the model to slow down for the first inference on the new dimensions.
A100 40 GB.
Normal inference: ~1 second
Inference after change in dimensions (for first time): ~10 seconds
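The ~10 s first inference at a new size is consistent with the graph compiling once per shape. If the target sizes are known up front, a short warm-up pass at startup hides that cost; a sketch with hypothetical sizes, same idea as the dynamic-resolution sketch earlier in this document:

# Pay the per-shape compilation cost at startup rather than on the first
# user request at each new size.
for w, h in [(512, 512), (640, 512), (768, 768)]:
    pipe("warmup", width=w, height=h, num_inference_steps=1)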
This question is similar to #87, but it seems like I'm not facing the same root error. I had followed @daquexian's solution to install the diffusers using pip install instead of pip install -e, but it does not work.
Besides, I can reproduce the issue using a py script instead of IPython or other similar notebooks.
According to the Reproduction part, it seems like the error may come from the pollution of mock_torch.enable?
Any advice is super appreciated.
import oneflow as torch
import torch as og_torch
from diffusers import (
OneFlowStableDiffusionPipeline as StableDiffusionPipeline,
OneFlowEulerDiscreteScheduler as EulerDiscreteScheduler,
OneFlowDPMSolverMultistepScheduler as DPMSolverMultistepScheduler
)
MODEL_ID = "/path/to/a/local/checkpoint"
scheduler = EulerDiscreteScheduler.from_pretrained(MODEL_ID, subfolder="scheduler")
# scheduler = DPMSolverMultistepScheduler.from_config(MODEL_ID, subfolder="scheduler") # also reproducible
diffusion_t2m = StableDiffusionPipeline.from_pretrained(MODEL_ID, scheduler=scheduler, revision="fp16", torch_dtype=torch.float16) # passing og_torch.float16 also reproducible
I pasted the above code into a Python script named test_loader.py.
First, I added some debug prints in diffusers/pipeline_oneflow_utils.py around torch.mock_torch.enable:
print(f"BEFORE: type: {torch} library_name {library_name} pid {os.getpid()} ppid {os.getppid()}")
traceback.print_stack()
with torch.mock_torch.enable():
print(f"IN: type: {torch} library_name {library_name} pid {os.getpid()} ppid {os.getppid()}")
# else we just import it from the library.
library = importlib.import_module(library_name)
class_obj = getattr(library, class_name)
This is the output:
BEFORE: type: <module 'torch' (<oneflow.mock_torch.OneflowImporter object at 0x7f620662d220>)> library_name transformers pid 3134 ppid 95
File "test_loader.py", line 10, in <module>
diffusion_t2m = StableDiffusionPipeline.from_pretrained(MODEL_ID, scheduler=scheduler, revision="fp16", torch_dtype=torch.float16)
File "/opt/conda/lib/python3.8/site-packages/diffusers/pipeline_oneflow_utils.py", line 719, in from_pretrained
traceback.print_stack()
IN: type: <module 'torch' (<oneflow.mock_torch.OneflowImporter object at 0x7f620662d220>)> library_name transformers pid 3134 ppid 95
BEFORE: type: <module 'torch' from '/opt/conda/lib/python3.8/site-packages/torch/__init__.py'> library_name diffusers pid 3134 ppid 95
File "test_loader.py", line 10, in <module>
diffusion_t2m = StableDiffusionPipeline.from_pretrained(MODEL_ID, scheduler=scheduler, revision="fp16", torch_dtype=torch.float16)
File "/opt/conda/lib/python3.8/site-packages/diffusers/pipeline_oneflow_utils.py", line 719, in from_pretrained
traceback.print_stack()
Traceback (most recent call last):
File "test_loader.py", line 10, in <module>
diffusion_t2m = StableDiffusionPipeline.from_pretrained(MODEL_ID, scheduler=scheduler, revision="fp16", torch_dtype=torch.float16)
File "/opt/conda/lib/python3.8/site-packages/diffusers/pipeline_oneflow_utils.py", line 720, in from_pretrained
with torch.mock_torch.enable():
AttributeError: module 'torch' has no attribute 'mock_torch'
Then I imported oneflow as oneflow (😯). The second BEFORE line above shows that, at that point, the name torch inside pipeline_oneflow_utils.py resolves to the real PyTorch module, which has no mock_torch attribute; referencing oneflow.mock_torch directly avoids depending on what torch happens to be bound to.
Here is my new header:
...
import numpy as np
import oneflow as torch
import torch as og_torch
import oneflow as oneflow
import traceback
import diffusers
Then I modified the mock_torch.enable() call as follows:
print(f"BEFORE: type: {torch} library_name {library_name} pid {os.getpid()} ppid {os.getppid()}")
traceback.print_stack()
# with torch.mock_torch.enable():
with oneflow.mock_torch.enable():
print(f"IN: type: {torch} library_name {library_name} pid {os.getpid()} ppid {os.getppid()}")
# else we just import it from the library.
library = importlib.import_module(library_name)
class_obj = getattr(library, class_name)
I executed the test script again, and it finally works:
BEFORE: type: <module 'torch' (<oneflow.mock_torch.OneflowImporter object at 0x7f63c677c1c0>)> library_name diffusers pid 3204 ppid 95
File "test_loader.py", line 10, in <module>
diffusion_t2m = StableDiffusionPipeline.from_pretrained(MODEL_ID, scheduler=scheduler, revision="fp16", torch_dtype=torch.float16)
File "/opt/conda/lib/python3.8/site-packages/diffusers/pipeline_oneflow_utils.py", line 719, in from_pretrained
traceback.print_stack()
IN: type: <module 'torch' (<oneflow.mock_torch.OneflowImporter object at 0x7f63c677c1c0>)> library_name diffusers pid 3204 ppid 95
BEFORE: type: <module 'torch' from '/opt/conda/lib/python3.8/site-packages/torch/__init__.py'> library_name transformers pid 3204 ppid 95
File "test_loader.py", line 10, in <module>
diffusion_t2m = StableDiffusionPipeline.from_pretrained(MODEL_ID, scheduler=scheduler, revision="fp16", torch_dtype=torch.float16)
File "/opt/conda/lib/python3.8/site-packages/diffusers/pipeline_oneflow_utils.py", line 719, in from_pretrained
traceback.print_stack()
IN: type: <module 'torch' (<oneflow.mock_torch.OneflowImporter object at 0x7f63c677c1c0>)> library_name transformers pid 3204 ppid 95
BEFORE: type: <module 'torch' from '/opt/conda/lib/python3.8/site-packages/torch/__init__.py'> library_name transformers pid 3204 ppid 95
File "test_loader.py", line 10, in <module>
diffusion_t2m = StableDiffusionPipeline.from_pretrained(MODEL_ID, scheduler=scheduler, revision="fp16", torch_dtype=torch.float16)
File "/opt/conda/lib/python3.8/site-packages/diffusers/pipeline_oneflow_utils.py", line 719, in from_pretrained
traceback.print_stack()
IN: type: <module 'torch' (<oneflow.mock_torch.OneflowImporter object at 0x7f63c677c1c0>)> library_name transformers pid 3204 ppid 95
BEFORE: type: <module 'torch' from '/opt/conda/lib/python3.8/site-packages/torch/__init__.py'> library_name diffusers pid 3204 ppid 95
File "test_loader.py", line 10, in <module>
diffusion_t2m = StableDiffusionPipeline.from_pretrained(MODEL_ID, scheduler=scheduler, revision="fp16", torch_dtype=torch.float16)
File "/opt/conda/lib/python3.8/site-packages/diffusers/pipeline_oneflow_utils.py", line 719, in from_pretrained
traceback.print_stack()
IN: type: <module 'torch' (<oneflow.mock_torch.OneflowImporter object at 0x7f63c677c1c0>)> library_name diffusers pid 3204 ppid 95
The config attributes {'class_embed_type': None, 'mid_block_type': 'UNetMidBlock2DCrossAttn', 'resnet_time_scale_shift': 'default', 'upcast_attention': False} were passed to OneFlowUNet2DConditionModel, but are not expected and will be ignored. Please verify your config.json configuration file.
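For reference, here is a minimal standalone sketch of the working pattern (assuming oneflow 0.9+, where mock_torch.enable() works as a context manager, as seen above): going through the oneflow module directly keeps the code independent of whatever the local name torch is bound to.

import oneflow

# Inside the block, `import torch` (e.g. deep inside transformers) resolves to
# oneflow; outside the block, the real PyTorch module is left untouched.
with oneflow.mock_torch.enable():
    import transformers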
No response
diffusers version: 0.10.0.dev0 (commit sha ea94536539aa1f17511b83a85fa08e3b9f989411)
root@f1be73ed83cb:/tmp# pip freeze | grep -E 'oneflow|torch'
oneflow==0.9.0
onnx @ file:///opt/pytorch/pytorch/third_party/onnx
pytorch-quantization==2.1.2
torch==1.12.0+cu116
torch-tensorrt @ file:///opt/pytorch/torch_tensorrt/py/dist/torch_tensorrt-1.1.0a0-cp38-cp38-linux_x86_64.whl
torchaudio==0.12.0+cu116
torchtext==0.13.0
torchvision==0.13.0+cu116
root@f1be73ed83cb:/tmp# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Thu_Feb_10_18:23:41_PST_2022
Cuda compilation tools, release 11.6, V11.6.112
Build cuda_11.6.r11.6/compiler.30978841_0
oneflow.cuda.amp.GradScaler is not implemented, please submit an issue at
'https://github.com/Oneflow-Inc/oneflow/issues' including the log information of the error, the
minimum reproduction code, and the system information.
No response
No response
Ubuntu, RTX 3090
Stable Diffusion 2.0 has been released; see https://github.com/Stability-AI/StableDiffusion.
The conv_out weight and bias raise an AssertionError:
While copying the parameter "conv_out.weight", an exception occurred:
Traceback (most recent call last):
File "/home/kang/anaconda3/envs/kyt/lib/python3.8/site-packages/oneflow/nn/module.py", line 788, in _load_from_state_dict
param.copy_(input_param)
File "/home/kang/anaconda3/envs/kyt/lib/python3.8/site-packages/oneflow/framework/tensor.py", line 341, in _copy
_copy_from_numpy_to_eager_local_tensor(self, other)
File "/home/kang/anaconda3/envs/kyt/lib/python3.8/site-packages/oneflow/framework/tensor.py", line 290, in _copy_from_numpy_to_eager_local_tensor
assert np_arr.dtype == flow.convert_oneflow_dtype_to_numpy_dtype(
AssertionError
While copying the parameter "conv_out.bias", an exception occurred:
Traceback (most recent call last):
File "/home/kang/anaconda3/envs/kyt/lib/python3.8/site-packages/oneflow/nn/module.py", line 788, in _load_from_state_dict
param.copy_(input_param)
File "/home/kang/anaconda3/envs/kyt/lib/python3.8/site-packages/oneflow/framework/tensor.py", line 341, in _copy
_copy_from_numpy_to_eager_local_tensor(self, other)
File "/home/kang/anaconda3/envs/kyt/lib/python3.8/site-packages/oneflow/framework/tensor.py", line 290, in _copy_from_numpy_to_eager_local_tensor
assert np_arr.dtype == flow.convert_oneflow_dtype_to_numpy_dtype(
AssertionError
No response
diffusers-cli env
loaded library: /usr/lib/x86_64-linux-gnu/libibverbs.so.1
Copy-and-paste the text below in your GitHub issue and FILL OUT the two last points.
diffusers version: 0.4.0.dev0
Installed diffusers by:
git clone https://github.com/Oneflow-Inc/diffusers.git
cd diffusers
python3 -m pip install -e .[oneflow]
but when running the example, it raises ImportError: cannot import name 'OneFlowStableDiffusionPipeline' from 'diffusers' (unknown location).
No response
Python 3.7.3 (default, Jan 22 2021, 20:04:44)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from diffusers import OneFlowStableDiffusionPipeline
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: cannot import name 'OneFlowStableDiffusionPipeline' from 'diffusers' (unknown location)
cuda version: 11.3
torch version: 1.10.0
Python 3.10.0 (default, Mar 3 2022, 09:58:08) [GCC 7.5.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.6.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from diffusers import OneFlowStableDiffusionPipeline
Traceback (most recent call last)
:1 in <module>
ImportError: cannot import name 'OneFlowStableDiffusionPipeline' from 'diffusers'
(/home/arthur/miniconda3/envs/dd/lib/python3.10/site-packages/diffusers/__init__.py)
In [2]: import diffusers
In [3]: diffusers.__version__
Out[3]: '0.12.1'
In [4]:
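diffusers 0.12.1 is a stock PyPI release; as far as I can tell (an assumption worth verifying), only the Oneflow-Inc fork exposes the OneFlow* classes, so this ImportError suggests the upstream package is shadowing the fork on sys.path. A quick diagnostic sketch:

import diffusers

# If __file__ points at a stock PyPI install rather than the Oneflow-Inc
# fork, the OneFlow* pipelines will not exist on the package.
print(diffusers.__version__)
print(diffusers.__file__)
print(hasattr(diffusers, "OneFlowStableDiffusionPipeline"))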
No response
No response
diffusers 0.12.1
Ubuntu 20.04
docker image: oneflowinc/oneflow-sd:cu112
nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
The CUDA version does not match: the image tag says cu112, but nvcc reports 11.8!