Comments (1)
Yes! I'm planning to look into using ThunderKittens once I've got more time (probably the 2nd week of June). I'm not sure there's much point in using it for kernels that don't use the tensor cores, though. But it might allow fusing even more things together (e.g. the matmul and the fused classifier, maybe).
My plan was to mostly focus on making a hyper-optimised path for H100 using TMA though... But we'll see what happens :)
from llm.c.