zigzagcai
Name: Season
Type: User
Location: Shanghai, China
Training and serving large-scale neural networks with auto parallelization.
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
Causal depthwise conv1d in CUDA, with a PyTorch interface
Making large AI models cheaper, faster and more accessible
The Python programming language
This is a collection of our NAS and Vision Transformer work.
CUDA Templates for Linear Algebra Subroutines
Parallel computing with task scheduling
Deep Learning Examples
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Python package built to ease deep learning on graph, on top of existing DL frameworks.
Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Flax.
Fast Hadamard transform in CUDA, with a PyTorch interface
Fast and memory-efficient exact attention
Flax is a neural network library for JAX that is designed for flexibility.
Collective communications library with various primitives for multi-machine training.
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
InternLM has open-sourced a 7 billion parameter base model, a chat model tailored for practical scenarios and the training system.
[CVPR 2024] InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks —— An Open-Source Alternative to ViT-22B
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
JAX implementation of the Llama 2 model
Port of Facebook's LLaMA model in C/C++
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
LLM training in simple, raw C/CUDA
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
An efficient PyTorch implementation of selective scan in one file, with the corresponding mathematical derivation. It works on both CPU and GPU, and is probably the code closest to selective_scan_cuda in mamba.