Comments (24)

strint commented on May 15, 2024

You may need to update transformers.

Delete the local OneFlow version of transformers and install the official release directly:

python3 -m pip install "transformers>=4.26"

cd diffusers

python3 -m pip install -e ".[oneflow]"

Reference: https://github.com/Oneflow-Inc/diffusers/pull/83#discussion_r1092913239
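
To confirm which transformers version is actually installed after the reinstall, a quick check like the following may help. It is only a minimal sketch: the helper names `release_tuple` and `meets_minimum` are made up for illustration, and the parsing looks only at the leading numeric components of the version string.

```python
import re
from importlib import metadata


def release_tuple(version: str) -> tuple:
    """Return the leading numeric components, e.g. '4.26.1' -> (4, 26, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: str = "4.26") -> bool:
    """True if `installed` is at least `minimum` (numeric release parts only)."""
    return release_tuple(installed) >= release_tuple(minimum)


if __name__ == "__main__":
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        print("transformers is not installed in this environment")
    else:
        verdict = "OK" if meets_minimum(installed) else "too old, needs >= 4.26"
        print(installed, verdict)
```

If the reported version is still below 4.26, the old local copy is probably still the one on the import path.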

from onediff.

yuanms2 commented on May 15, 2024

The cause of the error is explained here:

huggingface/transformers#20796

Yaodada12 commented on May 15, 2024

OK, thanks a lot. I noticed that the discussion there mentions compilation sharing. Does static graph compilation now support inference with dynamic input sizes?

Yaodada12 commented on May 15, 2024

Thank you very much!

strint commented on May 15, 2024

On whether static graph compilation now supports dynamic input sizes, see the updated comment here: https://github.com/Oneflow-Inc/diffusers/issues/75#issuecomment-1418789541

@Yaodada12

Yaodada12 commented on May 15, 2024

OK, thumbs up.

terrancewang commented on May 15, 2024

Hello, I had the same issue as in this post. After updating the transformers package, I now get a similar error:

Traceback (most recent call last):
  File "/home/terrance/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1110, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "/usr/lib/python3.10/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/home/terrance/.local/lib/python3.10/site-packages/transformers/models/clip/modeling_clip.py", line 27, in <module>
    from ...modeling_utils import PreTrainedModel
  File "/home/terrance/.local/lib/python3.10/site-packages/transformers/modeling_utils.py", line 83, in <module>
    from accelerate import __version__ as accelerate_version
  File "/home/terrance/.local/lib/python3.10/site-packages/accelerate/__init__.py", line 7, in <module>
    from .accelerator import Accelerator
  File "/home/terrance/.local/lib/python3.10/site-packages/accelerate/accelerator.py", line 27, in <module>
    import torch.utils.hooks as hooks
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 674, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 571, in module_from_spec
  File "/home/terrance/.local/lib/python3.10/site-packages/oneflow/mock_torch/__init__.py", line 88, in create_module
    raise NotImplementedError(oneflow_mod_fullname + error_msg)
NotImplementedError: oneflow.utils.hooks is not implemented, please submit an issue at
'https://github.com/Oneflow-Inc/oneflow/issues' including the log information of the error, the
minimum reproduction code, and the system information.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/terrance/oneflow/test_diffusion.py", line 2, in <module>
    from diffusers import OneFlowStableDiffusionPipeline
  File "/home/terrance/oneflow/diffusers/src/diffusers/__init__.py", line 22, in <module>
    from transformers import CLIPTextModel, CLIPFeatureExtractor
  File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
  File "/home/terrance/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1101, in __getattr__
    value = getattr(module, name)
  File "/home/terrance/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1100, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "/home/terrance/.local/lib/python3.10/site-packages/transformers/utils/import_utils.py", line 1112, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.models.clip.modeling_clip because of the following error (look up to see its traceback):
oneflow.utils.hooks is not implemented, please submit an issue at
'https://github.com/Oneflow-Inc/oneflow/issues' including the log information of the error, the
minimum reproduction code, and the system information.

Has anyone seen this before?

yuanms2 commented on May 15, 2024

The "oneflow.utils.hooks is not implemented" error is discussed here:
https://github.com/Oneflow-Inc/diffusers/issues/90
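
The RuntimeError at the bottom of that log only wraps the real failure. The line "The above exception was the direct cause of the following exception" means the original exception is stored in `__cause__`. Below is a small sketch of walking that chain to surface the underlying NotImplementedError. The names `root_cause` and `simulated_import` are illustrative only and are not part of transformers or oneflow, and the messages are shortened from the log.

```python
def root_cause(exc: BaseException) -> BaseException:
    """Follow the explicit `raise ... from ...` chain back to the first exception."""
    while exc.__cause__ is not None:
        exc = exc.__cause__
    return exc


def simulated_import() -> None:
    # Stand-in for the failing import; not the real library code.
    try:
        raise NotImplementedError("oneflow.utils.hooks is not implemented")
    except NotImplementedError as err:
        raise RuntimeError(
            "Failed to import transformers.models.clip.modeling_clip"
        ) from err


try:
    simulated_import()
except RuntimeError as err:
    cause = root_cause(err)
    print(type(cause).__name__, "-", cause)
```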

jackalcooper commented on May 15, 2024

Looks like this has been resolved; feel free to reopen if not.

Yaodada12 commented on May 15, 2024

@strint I ran into a problem: with the March build oneflow-0.9.1.dev20230312+cu117 installed, my code fails. Do you have the February build, oneflow==0.9.1.dev20230216+cu117?

strint commented on May 15, 2024

What is the problem? Please post the error message and the oneflow version number so we can follow up and fix it. @Yaodada12

Yaodada12 commented on May 15, 2024

@strint
Versions: oneflow-0.9.1.dev20230312+cu117 and oneflow-0.9.1.dev20230309+cu117 both fail; only oneflow-0.9.1.dev20230216+cu117 works.
It is still the same problem as before.
F20230313 10:33:57.451692 1326001 user_op_conf.cpp:87] Check failed: attr.get() != nullptr attr_name: query_head_size
*** Check failure stack trace: ***
@ 0x7fe57e7a7c9a google::LogMessage::Fail()
@ 0x7fe57e7aabd1 google::LogMessage::SendToLog()
@ 0x7fe57e7a77c9 google::LogMessage::Flush()
@ 0x7fe57e7ab4b9 google::LogMessageFatal::~LogMessageFatal()
@ 0x7fe576a967a9 oneflow::user_op::UserOpConfWrapper::Attr4Name()
@ 0x7fe57866c50e oneflow::user_op::(anonymous namespace)::FusedMultiHeadAttentionInferenceKernel::Compute()
@ 0x7fe5776572f7 oneflow::UserKernel::ForwardUserKernel()
@ 0x7fe5776574bb oneflow::UserKernel::ForwardDataContent()
@ 0x7fe577623f63 oneflow::Kernel::Forward()
@ 0x7fe577624069 oneflow::Kernel::Launch()
@ 0x7fe577772e58 oneflow::(anonymous namespace)::LightActor<>::ProcessMsg()
@ 0x7fe577d4d62f oneflow::Thread::PollMsgChannel()
@ 0x7fe577d4dcc1 _ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJZN7oneflow6ThreadC4ERKNS3_8StreamIdEEUlvE_EEEEE6_M_runEv
@ 0x7fe6b3ff9de4 (unknown)
@ 0x7fe72701e609 start_thread
@ 0x7fe726f43133 clone

Process finished with exit code 134 (interrupted by signal 6: SIGABRT)

Yaodada12 commented on May 15, 2024

@strint Do you have a whl file or a download link for the February build? I need it as a stopgap.

strint commented on May 15, 2024

@jackalcooper Do you know whether the February whl is still available anywhere? I just checked https://staging.oneflow.info/branch/master/cu117 and it only has the latest builds.

Ldpe2G commented on May 15, 2024

https://oneflow-staging.oss-cn-beijing.aliyuncs.com/branch/master/cu117/58cb11e279c1f8932c16a3eb400ba5b063d6f912/oneflow-0.9.1.dev20230216%2Bcu117-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

https://oneflow-staging.oss-cn-beijing.aliyuncs.com/branch/master/cu117/58cb11e279c1f8932c16a3eb400ba5b063d6f912/oneflow-0.9.1.dev20230216%2Bcu117-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

https://oneflow-staging.oss-cn-beijing.aliyuncs.com/branch/master/cu117/58cb11e279c1f8932c16a3eb400ba5b063d6f912/oneflow-0.9.1.dev20230216%2Bcu117-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

https://oneflow-staging.oss-cn-beijing.aliyuncs.com/branch/master/cu117/58cb11e279c1f8932c16a3eb400ba5b063d6f912/oneflow-0.9.1.dev20230216%2Bcu117-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
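
Each link targets one CPython version, indicated by the `cpXY` tag in the file name. Here is a small sketch for picking the entry that matches the running interpreter. The helper `pick_wheel` is hypothetical, and the list holds only the file names from the links above, without the host and path.

```python
import sys
from typing import Iterable, Optional


def pick_wheel(names: Iterable[str], major: int, minor: int) -> Optional[str]:
    """Return the first wheel name carrying the matching cpXY tag, else None."""
    tag = f"-cp{major}{minor}-"
    for name in names:
        if tag in name:
            return name
    return None


wheels = [
    "oneflow-0.9.1.dev20230216%2Bcu117-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "oneflow-0.9.1.dev20230216%2Bcu117-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "oneflow-0.9.1.dev20230216%2Bcu117-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "oneflow-0.9.1.dev20230216%2Bcu117-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
]

# Prints None when no wheel was built for the running Python version.
print(pick_wheel(wheels, sys.version_info.major, sys.version_info.minor))
```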

Yaodada12 commented on May 15, 2024

@Ldpe2G @strint Thanks a lot, both of you. Much appreciated.

strint commented on May 15, 2024

How is this problem triggered? Could you share the commit id of the oneflow sd code you are using?

@liujuncheng Judging by the error "user_op_conf.cpp:87] Check failed: attr.get() != nullptr attr_name: query_head_size", this involves an op that was optimized recently: Oneflow-Inc/oneflow#9963

Yaodada12 commented on May 15, 2024

@strint OK, I'll take a look later.

liujuncheng commented on May 15, 2024

Are you using a compilation cache or anything similar? If so, note that a compilation cache cannot be reused across different OneFlow versions. If not, could you provide a reproduction script?

Yaodada12 commented on May 15, 2024

Wow, that is exactly it. The model I compiled offline earlier was built with the February build, but the oneflow now installed via pip install --pre oneflow -f https://staging.oneflow.info/branch/master/cu117 is the March build, which caused the error. Switching back to the February build fixed it.

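strint commented on May 15, 2024

Op information is also part of the graph's runtime_state_dict (it lives in the execution plan). Recent versions changed some Op fields, so a runtime_state_dict saved by an earlier version is incompatible with recent versions, and loading the graph fails.

This case is quite typical, so I'll open a new issue to record it.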
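
One way to guard against this kind of mismatch is to record the framework version next to the saved graph and to skip the cache whenever the versions differ. The sketch below is framework-agnostic and hypothetical: `write_version_stamp`, `cache_matches_version`, and the `version.json` file name are invented for illustration and are not OneFlow APIs. In practice the version string would come from the installed package, for example `oneflow.__version__`.

```python
import json
import tempfile
from pathlib import Path

STAMP_NAME = "version.json"  # hypothetical sidecar file name


def write_version_stamp(cache_dir: Path, framework_version: str) -> None:
    """Record which framework version produced the cached graph."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    stamp = cache_dir / STAMP_NAME
    stamp.write_text(json.dumps({"version": framework_version}))


def cache_matches_version(cache_dir: Path, framework_version: str) -> bool:
    """True only if a stamp exists and matches the running framework version."""
    stamp = cache_dir / STAMP_NAME
    if not stamp.is_file():
        return False
    return json.loads(stamp.read_text()).get("version") == framework_version


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cache = Path(tmp) / "graph_cache"
        write_version_stamp(cache, "0.9.1.dev20230216+cu117")
        # Same build: the cached graph can be reused.
        print(cache_matches_version(cache, "0.9.1.dev20230216+cu117"))
        # Newer build: recompile instead of loading the stale cache.
        print(cache_matches_version(cache, "0.9.1.dev20230312+cu117"))
```

Comparing for exact equality is deliberately strict, since the thread shows that even nightly builds a few weeks apart can change Op fields.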

Yaodada12 commented on May 15, 2024

Kudos to you all.

Yaodada12 commented on May 15, 2024

@strint @Ldpe2G Do you have a whl file for the oneflow-0.9.1 build from April 11?

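strint commented on May 15, 2024

If it is not among the links above, there is probably no prebuilt wheel available.

Could you consider using a newer version instead? The interfaces are compatible.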
