Comments (3)
The Arc770 and the iGPU can't work in the same environment; we are still working on it (related issue: #10940). But the error there is different: it should be `RuntimeError: could not create a primitive`. The difference may be caused by your different torch version.
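As a workaround while both devices are installed, inference can be pinned to a single SYCL device via the `ONEAPI_DEVICE_SELECTOR` environment variable, as the runs later in this thread do. A sketch (the device index `0` is an assumption; check the `sycl-ls` output on your machine first):

```shell
# List the SYCL devices oneAPI can see, to find the right index.
sycl-ls
# Expose only Level Zero device 0 (the iGPU here) to the process.
export ONEAPI_DEVICE_SELECTOR=level_zero:0
python ./generate.py --repo-id-or-model-path meta-llama/Meta-Llama-3-8B-Instruct \
    --prompt 'History of Intel' --n-predict 64
```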
from bigdl.
Got it! I will remove the ARC770 and test my iGPU on MTL again.
BTW, I also tested the same SW environment on my TGL platform (Core i7-1185G7), and the iGPU indeed works well.
- SW environment: Ubuntu 22.04 + kernel v6.8.2
intel_extension_for_pytorch 2.1.20+git0e2bee2
torch 2.1.0.post0+cxx11.abi
torchvision 0.16.0+fbb4cc5
intel-openmp 2024.1.0
openvino 2024.1.0
openvino-telemetry 2024.1.0
[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2 [2024.17.3.0.08_160000]
[opencl:cpu:1] Intel(R) OpenCL, 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz OpenCL 3.0 (Build 0) [2024.17.3.0.08_160000]
[opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) Iris(R) Xe Graphics OpenCL 3.0 NEO [24.13.29138.7]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Iris(R) Xe Graphics 1.3 [1.3.29138]
- Test result
(llm-test) intel@myDUT:~/work/ipex-llm/python/llm/example/GPU/HF-Transformers-AutoModels/Model/llama3$ ONEAPI_DEVICE_SELECTOR=level_zero:0 python ./generate.py --repo-id-or-model-path meta-llama/Meta-Llama-3-8B-Instruct --prompt 'History of Intel' --n-predict 64
2024-05-15 09:57:23,463 - INFO - intel_extension_for_pytorch auto imported
/home/intel/anaconda3/envs/llm-test/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 5.16it/s]
2024-05-15 09:57:25,302 - INFO - Converting the current model to sym_int4 format......
/home/intel/anaconda3/envs/llm-test/lib/python3.11/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128009 for open-end generation.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128009 for open-end generation.
Inference time: 9.984711408615112 s
-------------------- Prompt --------------------
<|begin_of_text|><|start_header_id|>user<|end_header_id|>
History of Intel<|eot_id|><|start_header_id|>assistant<|end_header_id|>
-------------------- Output (skip_special_tokens=False) --------------------
<|begin_of_text|><|begin_of_text|><|start_header_id|>user<|end_header_id|>
History of Intel<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Intel Corporation is an American multinational corporation that specializes in the design and manufacture of microprocessors, memory chips, and other semiconductor technologies. Here is a brief history of the company:
**Early Years (1968-1979)**
Intel was founded on July 18, 1968, by Gordon Moore and Robert N
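The log line "Converting the current model to sym_int4 format" refers to symmetric 4-bit weight quantization. A minimal pure-Python sketch of the idea (illustrative only; the function names are hypothetical and this is not ipex-llm's actual kernel, which quantizes per block and runs on the GPU):

```python
def sym_int4_quantize(weights):
    # Symmetric 4-bit: signed integers in [-8, 7], zero point fixed at 0,
    # one float scale shared by the whole block of weights.
    scale = max(abs(w) for w in weights) / 7.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def sym_int4_dequantize(q, scale):
    # Reconstruction: multiply each 4-bit integer back by the scale.
    return [x * scale for x in q]

weights = [0.12, -0.5, 0.33, 0.07]
q, scale = sym_int4_quantize(weights)
restored = sym_int4_dequantize(q, scale)
# Each restored value lies within scale/2 of the original.
```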
@qiuxin2012, I appreciate your support.
@qiuxin2012 I confirmed the MTL-H iGPU works well without the ARC770 in the platform.
[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2 [2024.17.3.0.08_160000]
[opencl:cpu:1] Intel(R) OpenCL, Intel(R) Core(TM) Ultra 7 155H OpenCL 3.0 (Build 0) [2024.17.3.0.08_160000]
[opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) Graphics OpenCL 3.0 NEO [24.13.29138.7]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Arc(TM) Graphics 1.3 [1.3.29138]
...
(llm) intel@mydevice:~/work/ipex-llm/python/llm/example/GPU/HF-Transformers-AutoModels/Model/llama3$ ONEAPI_DEVICE_SELECTOR=level_zero:0 python ./generate.py --repo-id-or-model-path meta-llama/Meta-Llama-3-8B-Instruct --prompt 'History of Intel' --n-predict 64
2024-05-15 10:36:33,547 - INFO - intel_extension_for_pytorch auto imported
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 5.48it/s]
2024-05-15 10:36:34,559 - INFO - Converting the current model to sym_int4 format......
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128009 for open-end generation.
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128009 for open-end generation.
Inference time: 6.857227563858032 s
-------------------- Prompt --------------------
<|begin_of_text|><|start_header_id|>user<|end_header_id|>
History of Intel<|eot_id|><|start_header_id|>assistant<|end_header_id|>
-------------------- Output (skip_special_tokens=False) --------------------
<|begin_of_text|><|begin_of_text|><|start_header_id|>user<|end_header_id|>
History of Intel<|eot_id|><|start_header_id|>assistant<|end_header_id|>
The legendary Intel!
Intel Corporation is an American multinational corporation that specializes in the design and manufacture of microprocessors, the "brain" of modern computers. Here's a brief history of the company:
**Early Years (1968-1971)**
Intel was founded on July 18, 1968, by Gordon
LIBXSMM_VERSION: main_stable-1.17-3651 (25693763)
LIBXSMM_TARGET: adl [Intel(R) Core(TM) Ultra 7 155H]
Registry and code: 13 MB
Command: python ./generate.py --repo-id-or-model-path meta-llama/Meta-Llama-3-8B-Instruct --prompt History of Intel --n-predict 64
Uptime: 63.459550 s
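The prompt echoed in both runs follows Llama 3's instruct chat template. A small sketch of building that string by hand (the helper name is hypothetical; in practice the tokenizer's `apply_chat_template` produces it):

```python
def llama3_user_prompt(user_msg: str) -> str:
    # Llama 3 instruct format: special header tokens delimit each turn,
    # and the trailing assistant header cues the model to generate a reply.
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_msg}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = llama3_user_prompt("History of Intel")
```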