InfiniTensor / InfiniLM-Rust
☆125 · Updated last month
Alternatives and similar repositories for InfiniLM-Rust
Users interested in InfiniLM-Rust are comparing it to the repositories listed below.
- ☆65 · Updated last year
- Operator library ☆17 · Updated 4 months ago
- ☆274 · Updated last month
- Notes ☆47 · Updated 3 months ago
- Easy CUDA code ☆90 · Updated 11 months ago
- ☆29 · Updated last month
- ☆37 · Updated 10 months ago
- A domain-specific language (DSL) based on Triton but providing higher-level abstractions. ☆36 · Updated this week
- ☆63 · Updated 10 months ago
- Experiment: llama2 inference implemented in Rust ☆16 · Updated last year
- Code and examples for "CUDA - From Correctness to Performance" ☆117 · Updated last year
- A layered, decoupled deep learning inference engine ☆76 · Updated 9 months ago
- Operator library (Rust) ☆14 · Updated 4 months ago
- Triton documentation in Simplified Chinese / Triton 中文文档 ☆91 · Updated last week
- Training camp lecture notes ☆21 · Updated 10 months ago
- A PyTorch-like deep learning framework. Just for fun. ☆156 · Updated 2 years ago
- "Build Your Own AI Compiler" (《自己动手写AI编译器》) ☆31 · Updated last year
- LLM inference via Triton (flexible & modular): focused on kernel optimization using CUBIN binaries, starting from the gpt-oss model ☆56 · Updated last month
- Flash Attention from Scratch on CUDA Ampere ☆76 · Updated 3 months ago
- ☆70 · Updated 2 years ago
- Unified framework for large-scale auto-distributed training/inference | memory-compute-control decoupled architecture | multi-language SDK & … ☆55 · Updated 4 months ago
- Fast OS-level support for GPU checkpoint and restore ☆257 · Updated 2 months ago
- RustSBI Specialized Domain Knowledge Quiz LLM ☆104 · Updated last month
- A prefill & decode disaggregated LLM serving framework with shared GPU memory and fine-grained compute isolation. ☆116 · Updated 6 months ago
- FlagTree is a unified compiler for multiple AI chips, forked from triton-lang/triton. ☆137 · Updated this week
- Implement custom operators in PyTorch with CUDA/C++ ☆74 · Updated 2 years ago
- A llama model inference framework implemented in CUDA C++ ☆62 · Updated last year
- CUDA SGEMM optimization note ☆15 · Updated 2 years ago
- ☆48 · Updated 3 years ago
- Wiki for HPC ☆123 · Updated 4 months ago