tenstorrent / tt-buda-demos: Repository of model demos using TT-Buda
Is your feature request related to a problem? Please describe.
As we have open-sourced our TT-Buda demo and benchmarking repositories, we anticipate a growing community of contributors. Maintaining high code quality and consistent formatting becomes paramount in this collaborative environment.
Describe the solution you'd like
To ensure this, we're implementing a GitHub Actions workflow triggered whenever a pull request is opened in our repositories.
Additional context
GitHub Actions - Auto-Commit Linting and Cleaning
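A sketch of such a workflow is below. The workflow name, file path, and linter choices (black, isort, flake8) are assumptions; the repository may standardize on different tools.

```yaml
# .github/workflows/lint.yml (sketch; tool choices are assumptions)
name: Lint
on:
  pull_request:

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.8"
      - name: Install linters
        run: pip install black flake8 isort
      - name: Check formatting and lint
        run: |
          black --check .
          isort --check-only .
          flake8 .
```

An auto-commit variant would additionally run the formatters without `--check` and push the resulting diff back to the PR branch.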
TT-Buda, developed by Tenstorrent, is a growing collection of model demos showcasing the capabilities of AI models running on Tenstorrent hardware. These demonstrations cover a wide range of applications, aiming to provide insights and inspiration for developers and researchers interested in advanced AI implementations.
We are excited to announce a bounty for contributing a new AI model demonstration to the TT-Buda repository. This is an opportunity for AI enthusiasts, researchers, and developers to showcase their skills, contribute to cutting-edge AI research, and earn rewards.
Integrate Phi-2 (2.7B) into the TT-Buda demonstrations.
Place your contribution in the model_demos folder, following the naming convention: model_yourModelName.
Follow the guidelines in the CONTRIBUTING.md file.
Contributions will be evaluated by the Tenstorrent team, and the best contribution will be eligible for a $500 cash bounty.
Dive into AI development with the Grayskull DevKit, your gateway to exploring Tenstorrent's hardware. Paired with TT-Buda and TT-Metalium software approaches, it offers a solid foundation for AI experimentation. Secure your kit here.
Join our Discord to talk AI, share your journey, and get support from the Tenstorrent community and team. Let's innovate together!
We want to implement a feature that ensures current files already carry the correct SPDX licenses and that any new PRs being merged also adhere to this standard.
Before integrating the new functionality for screening new pull requests for SPDX license compliance, we need to conduct an audit on existing files. This audit will verify that all current files in the repository are compliant with their respective SPDX licenses or existing license(s).
One alternative approach is to manually check licenses in each PR, but this method is prone to human error.
This proactive measure will aid in preventing any future issues regarding licensing and will ensure that all contributions adhere to the required licensing standards.
The solution will be similar to the approach currently being implemented in the Tenstorrent benchmarking repository, where it is open as a PR.
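A minimal sketch of the audit step. The expected SPDX tag string, the covered file extensions, and the 512-byte search window are all assumptions; the repository's actual license identifiers may differ.

```python
import os

# Hypothetical SPDX tag and file set (assumptions; adjust to the repo's
# actual license identifiers and covered source types).
SPDX_TAG = "SPDX-License-Identifier:"
CHECKED_EXTENSIONS = (".py", ".sh")

def find_noncompliant(root):
    """Return paths of source files whose opening bytes lack an SPDX tag."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(CHECKED_EXTENSIONS):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as f:
                    head = f.read(512)  # tag is expected near the top of the file
                if SPDX_TAG not in head:
                    missing.append(path)
    return missing
```

Wired into a PR workflow, the job would fail (exit non-zero) whenever `find_noncompliant` returns any paths.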
Describe the bug
Errors encountered when attempting to run the Bert NLP demo.
To Reproduce
Steps to reproduce the behavior:
From: https://github.com/tenstorrent/tt-buda-demos/blob/main/first_5_steps/2_running_nlp_models.ipynb
python3 --version
Python 3.8.10
python3 bert_buda.py
Expected behavior
System (please complete the following information):
OS: Ubuntu 20.04 focal
Kernel: x86_64 Linux 5.15.0-102-generic
Uptime: 46m
Packages: 1950
Shell: bash 5.0.17
Disk: 27G / 1.9T (2%)
CPU: AMD Ryzen 9 7950X3D 16-Core @ 32x 4.2GHz
GPU: AMD/ATI
RAM: 3278MiB / 192425MiB
Additional context
Add any other context about the problem here.
2024-04-16 05:21:39.757 | DEBUG | pybuda.ttdevice:_create_intermediates_queue_device_connector:1421 - Creating fwd intermediates queue connector on TTDevice 'tt_device_0'
2024-04-16 05:21:39.757 | DEBUG | pybuda.ttdevice:_create_forward_output_queue_device_connector:1401 - Creating forward output queue connector on TTDevice 'tt_device_0'
2024-04-16 05:21:39.757 | ERROR | pybuda.run.impl:_start_device_processes:1180 - Process spawn error:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
2024-04-16 05:21:39.757 | DEBUG | pybuda.run.impl:_shutdown:1265 - PyBuda shutdown
2024-04-16 05:21:39.757 | DEBUG | pybuda.run.impl:_shutdown:1281 - Waiting until processes done
2024-04-16 05:21:39.757 | WARNING | pybuda.tensor:pytorch_dtype_to_buda_dataformat:594 - Parameter is int64. Setting to int8 for now.
2024-04-16 05:21:39.757 | WARNING | pybuda.tensor:pytorch_dtype_to_buda_dataformat:594 - Parameter is int64. Setting to int8 for now.
2024-04-16 05:21:39.757 | WARNING | pybuda.tensor:pytorch_dtype_to_buda_dataformat:594 - Parameter is int64. Setting to int8 for now.
2024-04-16 05:21:39.757 | WARNING | pybuda.tensor:pytorch_dtype_to_buda_dataformat:594 - Parameter is int64. Setting to int8 for now.
2024-04-16 05:21:39.757 | WARNING | pybuda.tensor:pytorch_dtype_to_buda_dataformat:594 - Parameter is int64. Setting to int8 for now.
2024-04-16 05:21:39.758 | WARNING | pybuda.tensor:pytorch_dtype_to_buda_dataformat:594 - Parameter is int64. Setting to int8 for now.
2024-04-16 05:21:39.858 | DEBUG | pybuda.device:get_command_queue_response:311 - Ending process on CPUDevice 'cpu0_fallback' due to shutdown event
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/usr/lib/python3.8/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/usr/lib/python3.8/multiprocessing/spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/usr/lib/python3.8/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "/usr/lib/python3.8/runpy.py", line 265, in run_path
return _run_module_code(code, init_globals, run_name,
File "/usr/lib/python3.8/runpy.py", line 97, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/smduck/Downloads/pyb/bert_buda.py", line 59, in <module>
output_q = pybuda.run_inference() # executes compilation (if first time) + runtime
File "/home/smduck/Downloads/pyb-20-04/env/lib/python3.8/site-packages/pybuda/run/api.py", line 90, in run_inference
return _run_inference(module, inputs, input_count, output_queue, _sequential, _perf_trace, _verify_cfg)
File "/home/smduck/Downloads/pyb-20-04/env/lib/python3.8/site-packages/pybuda/run/impl.py", line 277, in _run_inference
return _run_devices_inference(
File "/home/smduck/Downloads/pyb-20-04/env/lib/python3.8/site-packages/pybuda/run/impl.py", line 467, in _run_devices_inference
output_queue = _initialize_pipeline(False, output_queue, sequential=sequential, verify_cfg=verify_cfg)
File "/home/smduck/Downloads/pyb-20-04/env/lib/python3.8/site-packages/pybuda/run/impl.py", line 414, in _initialize_pipeline
_compile_devices(sequential, training=training, sample_inputs=sample_inputs, sample_targets=sample_targets, microbatch_count=microbatch_count, verify_cfg=verify_cfg)
File "/home/smduck/Downloads/pyb-20-04/env/lib/python3.8/site-packages/pybuda/run/impl.py", line 1250, in _compile_devices
raise RuntimeError(f"Compile failed for {d}")
RuntimeError: Compile failed for CPUDevice 'cpu0_fallback'
2024-04-16 05:21:39.861 | DEBUG | pybuda.run.impl:_shutdown:1265 - PyBuda shutdown
2024-04-16 05:21:39.861 | DEBUG | pybuda.run.impl:_shutdown:1281 - Waiting until processes done
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/home/smduck/Downloads/pyb-20-04/env/lib/python3.8/site-packages/pybuda/run/api.py", line 475, in shutdown
return _shutdown()
File "/home/smduck/Downloads/pyb-20-04/env/lib/python3.8/site-packages/pybuda/run/impl.py", line 1294, in _shutdown
p.terminate()
File "/usr/lib/python3.8/multiprocessing/process.py", line 133, in terminate
self._popen.terminate()
AttributeError: 'NoneType' object has no attribute 'terminate'
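The root cause appears to be the missing multiprocessing entry-point guard that the log itself suggests: PyBuda spawns worker processes, so the script's top-level calls must be protected. A minimal sketch of the idiom, applied to a generic worker rather than the actual contents of bert_buda.py:

```python
import multiprocessing as mp

def square(x):
    # Stand-in for the work a spawned child process would perform
    return x * x

if __name__ == "__main__":
    # Without this guard, 'spawn'-based start methods re-import the main
    # module in every child, which triggers the bootstrapping error above.
    with mp.Pool(2) as pool:
        print(pool.map(square, [1, 2, 3]))  # [1, 4, 9]
```

Applied to the demo, this would mean moving the `pybuda.run_inference()` call (and the rest of the script body) under an `if __name__ == '__main__':` block.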
Meta's Llama 3 is currently the leading edge of <10B models and is highly rated on the user-preference-based Chatbot Arena. Being able to run inference of such a powerful model could greatly increase the value and perceived potential of Tenstorrent's Grayskull accelerators.
Quantization is needed to fit an 8B model in the 8GB of LPDDR4 on the Grayskull cards. This might be possible with the block floating-point format Grayskull supports, specifically BFP4, though it would likely make this more complicated than adding other models.
Given the complexity, and the benefits that enabling inference of such a SOTA model would bring, prioritising it internally or attaching a significant bounty to it could be considered.
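A back-of-the-envelope check of why quantization is required. Only the weight memory is counted; activations, the KV cache, and BFP4's shared-exponent overhead (slightly more than 4 bits per weight) are ignored:

```python
# Rough weight-memory footprint of an 8B-parameter model on an 8 GB card
params = 8e9

fp16_gb = params * 2.0 / 1e9   # 16 GB: does not fit in 8 GB
fp8_gb  = params * 1.0 / 1e9   #  8 GB: borderline, no headroom
bfp4_gb = params * 0.5 / 1e9   # ~4 GB: fits, leaving room for activations

print(fp16_gb, fp8_gb, bfp4_gb)  # 16.0 8.0 4.0
```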
Describe the bug
pytorch_yolov3_holli_1x1.py does not work.
(python_env) dn-dev@grayskull-dev:~/tt-buda-demos/model_demos$ python3 ./cv_demos/yolo_v3/pytorch_yolov3_holli_1x1.py
Traceback (most recent call last):
File "/home/dn-dev/tt-buda-demos/model_demos/./cv_demos/yolo_v3/pytorch_yolov3_holli_1x1.py", line 9, in <module>
from cv_demos.yolo_v3.holli_src.yolov3 import *
File "/home/dn-dev/tt-buda-demos/model_demos/cv_demos/yolo_v3/holli_src/yolov3.py", line 5, in <module>
from .yolov3_base import *
File "/home/dn-dev/tt-buda-demos/model_demos/cv_demos/yolo_v3/holli_src/yolov3_base.py", line 3, in <module>
from collections import Iterable, OrderedDict, defaultdict
ImportError: cannot import name 'Iterable' from 'collections' (/usr/lib/python3.10/collections/__init__.py)
Expected behavior
System (please complete the following information):
Additional context
The Iterable abstract base class was removed from collections in Python 3.10; see the deprecation note in the Python 3.9 collections docs and the "Removed" section of the Python 3.10 changelog. It must now be imported from collections.abc.
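The fix in yolov3_base.py is to split the import, which works on all supported Python versions (collections.abc has existed since Python 3.3):

```python
# ABCs such as Iterable live in collections.abc; only the concrete
# containers remain importable from collections on Python 3.10+.
from collections import OrderedDict, defaultdict
from collections.abc import Iterable
```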
Describe the bug
tt-buda-demos/model_demos/nlp_demos/falcon$ python3 pytorch_falcon.py
Traceback (most recent call last):
File "pytorch_falcon.py", line 6, in <module>
from nlp_demos.falcon.utils.model import Falcon
ModuleNotFoundError: No module named 'nlp_demos'
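The likely cause is the working directory: `nlp_demos` is a package under `model_demos`, so Python can only resolve the absolute import when the interpreter is launched from `model_demos` (or with it on `PYTHONPATH`), not from inside `nlp_demos/falcon`. A minimal, self-contained reproduction of that failure mode (the directory layout is a stand-in for the repo's):

```python
import os
import subprocess
import sys
import tempfile

# Build a throwaway package named like the one in the repo, then try to
# import it from two different working directories.
with tempfile.TemporaryDirectory() as root:
    pkg = os.path.join(root, "nlp_demos")
    os.makedirs(pkg)
    open(os.path.join(pkg, "__init__.py"), "w").close()

    # Launched from the package's parent directory, the import succeeds...
    from_parent = subprocess.run([sys.executable, "-c", "import nlp_demos"],
                                 cwd=root, capture_output=True)
    # ...launched from inside the package directory, it fails as in the report.
    from_inside = subprocess.run([sys.executable, "-c", "import nlp_demos"],
                                 cwd=pkg, capture_output=True)

print(from_parent.returncode, from_inside.returncode)  # 0 1
```

So the demo would presumably run as `python3 nlp_demos/falcon/pytorch_falcon.py` from the `model_demos` root.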
Currently the demos in this repo aren't run in an automated fashion, as far as is publicly visible. This has a few implications:
Running the demos in CI (on every change and/or on a schedule) and tracking some performance statistics would help with all of these problems. It would also help sell the value and robustness of Tenstorrent hardware, by having multiple models run periodically and consistently on it.
Furthermore, running these demos in CI for Tenstorrent firmware/drivers could also help validate software changes there. It is essentially a form of integration testing for Tenstorrent hardware.
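A sketch of what a scheduled demo run could look like. The runner labels, demo path, and timing approach are all assumptions, since executing these demos requires a self-hosted runner with Tenstorrent hardware attached:

```yaml
# .github/workflows/run-demos.yml (sketch; runner labels and paths are assumptions)
name: Run model demos
on:
  schedule:
    - cron: "0 6 * * *"   # nightly
  push:
    branches: [main]

jobs:
  demos:
    runs-on: [self-hosted, grayskull]   # assumed runner with TT hardware
    steps:
      - uses: actions/checkout@v4
      - name: Run a demo and record wall-clock time
        run: |
          cd model_demos
          /usr/bin/time -v python3 nlp_demos/bert/pytorch_bert_masked_lm.py
```

Collecting the timing output per run would give the performance trend line the issue asks for.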
Describe the solution you'd like
Add in configurations to the demos to allow the use of multi-chip systems.
This can be set using the environment variable PYBUDA_OVERRIDE_NUM_CHIPS, or it may be set via pybuda.config._get_global_compiler_config().
Parameterize this variable for each test and update the tests/ directory to allow for multi-chip systems.
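A minimal sketch of the environment-variable route. The variable name comes from the issue; the plain-integer value format is an assumption, and the compiler-config alternative is left as a comment because the relevant attribute on the returned config object is not named in the issue:

```python
import os

# Select the chip count before pybuda is imported (value format assumed).
os.environ["PYBUDA_OVERRIDE_NUM_CHIPS"] = "2"

# Alternative route mentioned in the issue (attribute name not shown there):
# import pybuda
# compiler_cfg = pybuda.config._get_global_compiler_config()
# ...set the multi-chip option on compiler_cfg...
```

In a parameterized test, each demo would run once per chip count with this variable set accordingly.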
TT-Buda, developed by Tenstorrent, is a growing collection of model demos showcasing the capabilities of AI models running on Tenstorrent hardware. These demonstrations cover a wide range of applications, aiming to provide insights and inspiration for developers and researchers interested in advanced AI implementations.
We are excited to announce a bounty for contributing a new AI model demonstration to the TT-Buda repository. This is an opportunity for AI enthusiasts, researchers, and developers to showcase their skills, contribute to cutting-edge AI research, and earn rewards.
Integrate Gemma 2B into the TT-Buda demonstrations.
Place your contribution in the model_demos folder, following the naming convention: model_yourModelName.
Follow the guidelines in the CONTRIBUTING.md file.
Contributions will be evaluated by the Tenstorrent team, and the best contribution will be eligible for a $500 cash bounty.
Dive into AI development with the Grayskull DevKit, your gateway to exploring Tenstorrent's hardware. Paired with TT-Buda and TT-Metalium software approaches, it offers a solid foundation for AI experimentation. Secure your kit here.
Join our Discord to talk AI, share your journey, and get support from the Tenstorrent community and team. Let's innovate together!
TT-Buda model demos, developed by Tenstorrent, are a growing collection of model demos showcasing the capabilities of AI models running on Tenstorrent hardware. These demonstrations cover a wide range of applications, aiming to provide insights and inspiration for developers and researchers interested in advanced AI implementations.
We are excited to announce a bounty for contributing a new AI model demonstration to the TT-Buda repository. This is an opportunity for AI enthusiasts, researchers, and developers to showcase their skills, contribute to cutting-edge AI research, and earn rewards.
Integrate Qwen-1.5 (0.5B) into the TT-Buda model demonstrations.
Place your contribution in the model_demos folder, following the naming convention: model_yourModelName.
Follow the guidelines in the CONTRIBUTING.md file.
Contributions will be evaluated by the Tenstorrent team, and the best contribution will be eligible for a $500 bounty.
Dive into AI development with the Grayskull DevKit, your gateway to exploring Tenstorrent's hardware. Paired with TT-Buda and TT-Metalium software approaches, it offers a solid foundation for AI experimentation. Secure your kit here.
Join our Discord to talk AI, share your journey, and get support from the Tenstorrent community and team. Let's innovate together!