tenstorrent / tt-buda-demos: Repository of model demos using TT-Buda
Is your feature request related to a problem? Please describe.
As we have open-sourced our TT-Buda demo and benchmarking repositories, we anticipate a growing community of contributors. Maintaining high code quality and consistent formatting becomes paramount in this collaborative environment.
Describe the solution you'd like
To ensure this, we're implementing a GitHub Actions workflow triggered whenever a pull request is opened in our repositories.
Additional context
GitHub Actions - Auto-Commit Linting and Cleaning
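A sketch of such a workflow is below. The workflow name, file path, and linter choices (black, isort, flake8) are assumptions; the repository may standardize on different tools.

```yaml
# .github/workflows/lint.yml (sketch; tool choices are assumptions)
name: Lint
on:
  pull_request:

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.8"
      - name: Install linters
        run: pip install black flake8 isort
      - name: Check formatting and lint
        run: |
          black --check .
          isort --check-only .
          flake8 .
```

An auto-commit variant would additionally run the formatters without `--check` and push the resulting diff back to the PR branch.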
TT-Buda, developed by Tenstorrent, is a growing collection of model demos showcasing the capabilities of AI models running on Tenstorrent hardware. These demonstrations cover a wide range of applications, aiming to provide insights and inspiration for developers and researchers interested in advanced AI implementations.
We are excited to announce a bounty for contributing a new AI model demonstration to the TT-Buda repository. This is an opportunity for AI enthusiasts, researchers, and developers to showcase their skills, contribute to cutting-edge AI research, and earn rewards.
Integrate Phi-2 (2.7B) into the TT-Buda demonstrations.
Place your contribution in the model_demos folder, following the naming convention: model_yourModelName.
Follow the guidelines in the CONTRIBUTING.md file.
Contributions will be evaluated by the Tenstorrent team, and the best contribution will be eligible for a $500 cash bounty.
Dive into AI development with the Grayskull DevKit, your gateway to exploring Tenstorrent's hardware. Paired with TT-Buda and TT-Metalium software approaches, it offers a solid foundation for AI experimentation. Secure your kit here.
Join our Discord to talk AI, share your journey, and get support from the Tenstorrent community and team. Let's innovate together!
We want to implement a feature that ensures current files already carry the correct SPDX licenses and that any new PRs being merged also adhere to this standard.
Before integrating the new functionality for screening new pull requests for SPDX license compliance, we need to conduct an audit on existing files. This audit will verify that all current files in the repository are compliant with their respective SPDX licenses or existing license(s).
One alternative approach is to manually check licenses in each PR, but this method is prone to human error.
This proactive measure will aid in preventing any future issues regarding licensing and will ensure that all contributions adhere to the required licensing standards.
The solution will be similar to the approach currently being implemented in the Tenstorrent benchmarking repository, where it is open as a PR.
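A minimal sketch of the audit step. The expected SPDX tag string, the covered file extensions, and the 512-byte search window are all assumptions; the repository's actual license identifiers may differ.

```python
import os

# Hypothetical SPDX tag and file set (assumptions; adjust to the repo's
# actual license identifiers and covered source types).
SPDX_TAG = "SPDX-License-Identifier:"
CHECKED_EXTENSIONS = (".py", ".sh")

def find_noncompliant(root):
    """Return paths of source files whose opening bytes lack an SPDX tag."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(CHECKED_EXTENSIONS):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as f:
                    head = f.read(512)  # tag is expected near the top of the file
                if SPDX_TAG not in head:
                    missing.append(path)
    return missing
```

Wired into a PR workflow, the job would fail (exit non-zero) whenever `find_noncompliant` returns any paths.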
Describe the bug
Errors encountered when attempting to run the Bert NLP demo.
To Reproduce
Steps to reproduce the behavior:
From: https://github.com/tenstorrent/tt-buda-demos/blob/main/first_5_steps/2_running_nlp_models.ipynb
python3 --version
Python 3.8.10
python3 bert_buda.py
Expected behavior
System (please complete the following information):
OS: Ubuntu 20.04 focal
Kernel: x86_64 Linux 5.15.0-102-generic
Uptime: 46m
Packages: 1950
Shell: bash 5.0.17
Disk: 27G / 1.9T (2%)
CPU: AMD Ryzen 9 7950X3D 16-Core @ 32x 4.2GHz
GPU: AMD/ATI
RAM: 3278MiB / 192425MiB
Additional context
Add any other context about the problem here.
2024-04-16 05:21:39.757 | DEBUG | pybuda.ttdevice:_create_intermediates_queue_device_connector:1421 - Creating fwd intermediates queue connector on TTDevice 'tt_device_0'
2024-04-16 05:21:39.757 | DEBUG | pybuda.ttdevice:_create_forward_output_queue_device_connector:1401 - Creating forward output queue connector on TTDevice 'tt_device_0'
2024-04-16 05:21:39.757 | ERROR | pybuda.run.impl:_start_device_processes:1180 - Process spawn error:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
2024-04-16 05:21:39.757 | DEBUG | pybuda.run.impl:_shutdown:1265 - PyBuda shutdown
2024-04-16 05:21:39.757 | DEBUG | pybuda.run.impl:_shutdown:1281 - Waiting until processes done
2024-04-16 05:21:39.757 | WARNING | pybuda.tensor:pytorch_dtype_to_buda_dataformat:594 - Parameter is int64. Setting to int8 for now.
2024-04-16 05:21:39.757 | WARNING | pybuda.tensor:pytorch_dtype_to_buda_dataformat:594 - Parameter is int64. Setting to int8 for now.
2024-04-16 05:21:39.757 | WARNING | pybuda.tensor:pytorch_dtype_to_buda_dataformat:594 - Parameter is int64. Setting to int8 for now.
2024-04-16 05:21:39.757 | WARNING | pybuda.tensor:pytorch_dtype_to_buda_dataformat:594 - Parameter is int64. Setting to int8 for now.
2024-04-16 05:21:39.757 | WARNING | pybuda.tensor:pytorch_dtype_to_buda_dataformat:594 - Parameter is int64. Setting to int8 for now.
2024-04-16 05:21:39.758 | WARNING | pybuda.tensor:pytorch_dtype_to_buda_dataformat:594 - Parameter is int64. Setting to int8 for now.
2024-04-16 05:21:39.858 | DEBUG | pybuda.device:get_command_queue_response:311 - Ending process on CPUDevice 'cpu0_fallback' due to shutdown event
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/usr/lib/python3.8/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/usr/lib/python3.8/multiprocessing/spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/usr/lib/python3.8/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "/usr/lib/python3.8/runpy.py", line 265, in run_path
return _run_module_code(code, init_globals, run_name,
File "/usr/lib/python3.8/runpy.py", line 97, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/smduck/Downloads/pyb/bert_buda.py", line 59, in <module>
output_q = pybuda.run_inference() # executes compilation (if first time) + runtime
File "/home/smduck/Downloads/pyb-20-04/env/lib/python3.8/site-packages/pybuda/run/api.py", line 90, in run_inference
return _run_inference(module, inputs, input_count, output_queue, _sequential, _perf_trace, _verify_cfg)
File "/home/smduck/Downloads/pyb-20-04/env/lib/python3.8/site-packages/pybuda/run/impl.py", line 277, in _run_inference
return _run_devices_inference(
File "/home/smduck/Downloads/pyb-20-04/env/lib/python3.8/site-packages/pybuda/run/impl.py", line 467, in _run_devices_inference
output_queue = _initialize_pipeline(False, output_queue, sequential=sequential, verify_cfg=verify_cfg)
File "/home/smduck/Downloads/pyb-20-04/env/lib/python3.8/site-packages/pybuda/run/impl.py", line 414, in _initialize_pipeline
_compile_devices(sequential, training=training, sample_inputs=sample_inputs, sample_targets=sample_targets, microbatch_count=microbatch_count, verify_cfg=verify_cfg)
File "/home/smduck/Downloads/pyb-20-04/env/lib/python3.8/site-packages/pybuda/run/impl.py", line 1250, in _compile_devices
raise RuntimeError(f"Compile failed for {d}")
RuntimeError: Compile failed for CPUDevice 'cpu0_fallback'
2024-04-16 05:21:39.861 | DEBUG | pybuda.run.impl:_shutdown:1265 - PyBuda shutdown
2024-04-16 05:21:39.861 | DEBUG | pybuda.run.impl:_shutdown:1281 - Waiting until processes done
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/home/smduck/Downloads/pyb-20-04/env/lib/python3.8/site-packages/pybuda/run/api.py", line 475, in shutdown
return _shutdown()
File "/home/smduck/Downloads/pyb-20-04/env/lib/python3.8/site-packages/pybuda/run/impl.py", line 1294, in _shutdown
p.terminate()
File "/usr/lib/python3.8/multiprocessing/process.py", line 133, in terminate
self._popen.terminate()
AttributeError: 'NoneType' object has no attribute 'terminate'
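The root cause appears to be the missing multiprocessing entry-point guard that the log itself suggests: PyBuda spawns worker processes, so the script's top-level calls must be protected. A minimal sketch of the idiom, applied to a generic worker rather than the actual contents of bert_buda.py:

```python
import multiprocessing as mp

def square(x):
    # Stand-in for the work a spawned child process would perform
    return x * x

if __name__ == "__main__":
    # Without this guard, 'spawn'-based start methods re-import the main
    # module in every child, which triggers the bootstrapping error above.
    with mp.Pool(2) as pool:
        print(pool.map(square, [1, 2, 3]))  # [1, 4, 9]
```

Applied to the demo, this would mean moving the `pybuda.run_inference()` call (and the rest of the script body) under an `if __name__ == '__main__':` block.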
Meta's Llama 3 is currently the leading edge of <10B models and is highly rated on the user-preference-based Chatbot Arena. Being able to run inference of such a powerful model could greatly increase the value and perceived potential of Tenstorrent's Grayskull accelerators.
Quantization is needed to fit an 8B model in the 8GB of LPDDR4 on the Grayskull cards. This might be possible with the block floating-point format Grayskull supports, specifically BFP4, though it would likely make this more complicated than adding other models.
Given the complexity, and the benefits that enabling inference of such a SOTA model would bring, prioritising it internally or attaching a significant bounty to it could be considered.
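A back-of-the-envelope check of why quantization is required. Only the weight memory is counted; activations, the KV cache, and BFP4's shared-exponent overhead (slightly more than 4 bits per weight) are ignored:

```python
# Rough weight-memory footprint of an 8B-parameter model on an 8 GB card
params = 8e9

fp16_gb = params * 2.0 / 1e9   # 16 GB: does not fit in 8 GB
fp8_gb  = params * 1.0 / 1e9   #  8 GB: borderline, no headroom
bfp4_gb = params * 0.5 / 1e9   # ~4 GB: fits, leaving room for activations

print(fp16_gb, fp8_gb, bfp4_gb)  # 16.0 8.0 4.0
```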
Describe the bug
pytorch_yolov3_holli_1x1.py does not work.
(python_env) dn-dev@grayskull-dev:~/tt-buda-demos/model_demos$ python3 ./cv_demos/yolo_v3/pytorch_yolov3_holli_1x1.py
Traceback (most recent call last):
File "/home/dn-dev/tt-buda-demos/model_demos/./cv_demos/yolo_v3/pytorch_yolov3_holli_1x1.py", line 9, in <module>
from cv_demos.yolo_v3.holli_src.yolov3 import *
File "/home/dn-dev/tt-buda-demos/model_demos/cv_demos/yolo_v3/holli_src/yolov3.py", line 5, in <module>
from .yolov3_base import *
File "/home/dn-dev/tt-buda-demos/model_demos/cv_demos/yolo_v3/holli_src/yolov3_base.py", line 3, in <module>
from collections import Iterable, OrderedDict, defaultdict
ImportError: cannot import name 'Iterable' from 'collections' (/usr/lib/python3.10/collections/__init__.py)
Expected behavior
System (please complete the following information):
Additional context
The Iterable abstract base class was removed from collections in Python 3.10; see the deprecation note in the Python 3.9 collections docs and the "Removed" section of the Python 3.10 changelog. It must now be imported from collections.abc.
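The fix in yolov3_base.py is to split the import, which works on all supported Python versions (collections.abc has existed since Python 3.3):

```python
# ABCs such as Iterable live in collections.abc; only the concrete
# containers remain importable from collections on Python 3.10+.
from collections import OrderedDict, defaultdict
from collections.abc import Iterable
```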
Describe the bug
tt-buda-demos/model_demos/nlp_demos/falcon$ python3 pytorch_falcon.py
Traceback (most recent call last):
File "pytorch_falcon.py", line 6, in <module>
from nlp_demos.falcon.utils.model import Falcon
ModuleNotFoundError: No module named 'nlp_demos'
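The likely cause is the working directory: `nlp_demos` is a package under `model_demos`, so Python can only resolve the absolute import when the interpreter is launched from `model_demos` (or with it on `PYTHONPATH`), not from inside `nlp_demos/falcon`. A minimal, self-contained reproduction of that failure mode (the directory layout is a stand-in for the repo's):

```python
import os
import subprocess
import sys
import tempfile

# Build a throwaway package named like the one in the repo, then try to
# import it from two different working directories.
with tempfile.TemporaryDirectory() as root:
    pkg = os.path.join(root, "nlp_demos")
    os.makedirs(pkg)
    open(os.path.join(pkg, "__init__.py"), "w").close()

    # Launched from the package's parent directory, the import succeeds...
    from_parent = subprocess.run([sys.executable, "-c", "import nlp_demos"],
                                 cwd=root, capture_output=True)
    # ...launched from inside the package directory, it fails as in the report.
    from_inside = subprocess.run([sys.executable, "-c", "import nlp_demos"],
                                 cwd=pkg, capture_output=True)

print(from_parent.returncode, from_inside.returncode)  # 0 1
```

So the demo would presumably run as `python3 nlp_demos/falcon/pytorch_falcon.py` from the `model_demos` root.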
Currently the demos in this repo aren't run in an automated fashion, as far as is publicly visible. This has a few implications:
Running the demos in CI (on every change and/or on a schedule) and tracking some performance statistics would help with all of these problems. It would also help sell the value and robustness of Tenstorrent hardware, by having multiple models run periodically and consistently on it.
Furthermore, running these demos in CI for Tenstorrent firmware/drivers could also help validate software changes there. It is essentially a form of integration testing for Tenstorrent hardware.
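A sketch of what a scheduled demo run could look like. The runner labels, demo path, and timing approach are all assumptions, since executing these demos requires a self-hosted runner with Tenstorrent hardware attached:

```yaml
# .github/workflows/run-demos.yml (sketch; runner labels and paths are assumptions)
name: Run model demos
on:
  schedule:
    - cron: "0 6 * * *"   # nightly
  push:
    branches: [main]

jobs:
  demos:
    runs-on: [self-hosted, grayskull]   # assumed runner with TT hardware
    steps:
      - uses: actions/checkout@v4
      - name: Run a demo and record wall-clock time
        run: |
          cd model_demos
          /usr/bin/time -v python3 nlp_demos/bert/pytorch_bert_masked_lm.py
```

Collecting the timing output per run would give the performance trend line the issue asks for.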
Describe the solution you'd like
Add in configurations to the demos to allow the use of multi-chip systems.
This can be set using the environment variable PYBUDA_OVERRIDE_NUM_CHIPS, or it may be set via pybuda.config._get_global_compiler_config().
Parameterize this variable for each test and update the tests/ directory to allow for multi-chip systems.
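A minimal sketch of the environment-variable route. The variable name comes from the issue; the plain-integer value format is an assumption, and the compiler-config alternative is left as a comment because the relevant attribute on the returned config object is not named in the issue:

```python
import os

# Select the chip count before pybuda is imported (value format assumed).
os.environ["PYBUDA_OVERRIDE_NUM_CHIPS"] = "2"

# Alternative route mentioned in the issue (attribute name not shown there):
# import pybuda
# compiler_cfg = pybuda.config._get_global_compiler_config()
# ...set the multi-chip option on compiler_cfg...
```

In a parameterized test, each demo would run once per chip count with this variable set accordingly.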
TT-Buda, developed by Tenstorrent, is a growing collection of model demos showcasing the capabilities of AI models running on Tenstorrent hardware. These demonstrations cover a wide range of applications, aiming to provide insights and inspiration for developers and researchers interested in advanced AI implementations.
We are excited to announce a bounty for contributing a new AI model demonstration to the TT-Buda repository. This is an opportunity for AI enthusiasts, researchers, and developers to showcase their skills, contribute to cutting-edge AI research, and earn rewards.
Integrate Gemma 2B into the TT-Buda demonstrations.
Place your contribution in the model_demos folder, following the naming convention: model_yourModelName.
Follow the guidelines in the CONTRIBUTING.md file.
Contributions will be evaluated by the Tenstorrent team, and the best contribution will be eligible for a $500 cash bounty.
Dive into AI development with the Grayskull DevKit, your gateway to exploring Tenstorrent's hardware. Paired with TT-Buda and TT-Metalium software approaches, it offers a solid foundation for AI experimentation. Secure your kit here.
Join our Discord to talk AI, share your journey, and get support from the Tenstorrent community and team. Let's innovate together!
TT-Buda model demos, developed by Tenstorrent, are a growing collection of model demos showcasing the capabilities of AI models running on Tenstorrent hardware. These demonstrations cover a wide range of applications, aiming to provide insights and inspiration for developers and researchers interested in advanced AI implementations.
We are excited to announce a bounty for contributing a new AI model demonstration to the TT-Buda repository. This is an opportunity for AI enthusiasts, researchers, and developers to showcase their skills, contribute to cutting-edge AI research, and earn rewards.
Integrate Qwen-1.5 (0.5B) into the TT-Buda model demonstrations.
Place your contribution in the model_demos folder, following the naming convention: model_yourModelName.
Follow the guidelines in the CONTRIBUTING.md file.
Contributions will be evaluated by the Tenstorrent team, and the best contribution will be eligible for a $500 bounty.
Dive into AI development with the Grayskull DevKit, your gateway to exploring Tenstorrent's hardware. Paired with TT-Buda and TT-Metalium software approaches, it offers a solid foundation for AI experimentation. Secure your kit here.
Join our Discord to talk AI, share your journey, and get support from the Tenstorrent community and team. Let's innovate together!