A tiny deep learning training framework implemented from scratch in C++ that follows PyTorch's API.
☆160Jan 29, 2026Updated last month
Alternatives and similar repositories for TinyTorch
Users who are interested in TinyTorch are comparing it to the libraries listed below.
- Tiny C++ LLM inference implementation from scratch☆106Jan 29, 2026Updated last month
- A std::execution style runtime context and High Performance RPC Transport for using OpenUCX. Including CUDA/ROCM/... devices with RDMA.☆29Feb 22, 2026Updated 2 weeks ago
- Aligntune: A Modular Toolkit for Post Training Alignment of LLMs☆35Feb 26, 2026Updated last week
- ☆33Jul 23, 2024Updated last year
- 3D Game Engine.☆22Nov 18, 2024Updated last year
- Instead of scrolling on your phone after work in the evening, learn something. Series 1: CUFX (Cuda Framework eXtended), a CUDA compute framework.☆16Dec 15, 2024Updated last year
- Examples and exercises from the book Programming Massively Parallel Processors - A Hands-on Approach. David B. Kirk and Wen-mei W. Hwu (T…☆77Jan 21, 2021Updated 5 years ago
- A practical way of learning Swizzle☆37Feb 3, 2025Updated last year
- An object detection codebase based on MegEngine.☆28Dec 14, 2022Updated 3 years ago
- FlashTile is a CUDA Tile IR compiler that is compatible with NVIDIA's tileiras, targeting SM70 through SM121 NVIDIA GPUs.☆56Feb 6, 2026Updated last month
- A toy Python DL training library with PyTorch like API☆38Sep 23, 2025Updated 5 months ago
- Android demo for dabnn☆20Oct 18, 2019Updated 6 years ago
- ☆26May 7, 2021Updated 4 years ago
- This is our Compiler Design project for 6th semester.☆12May 15, 2022Updated 3 years ago
- Gensis is a lightweight deep learning framework written from scratch in Python, with Triton as its backend for high-performance computing…☆37Jan 15, 2026Updated last month
- DeeperGEMM: crazy optimized version☆74May 5, 2025Updated 10 months ago
- This is a repository to practice multi-thread programming in C++☆28Feb 21, 2024Updated 2 years ago
- ☆120Apr 11, 2024Updated last year
- GitHub Action for radar - a static analysis tool for rust, anchor, stylus, and solidity smart contracts.☆10Feb 18, 2026Updated 2 weeks ago
- A feature-rich concurrency kit, yet another DAG framework☆10Jan 18, 2026Updated last month
- TensorRT encapsulation, learn, rewrite, practice.☆30Oct 19, 2022Updated 3 years ago
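- CLIP inference implemented in C++. The model has been changed slightly; the changes and the model export code can be found in the README, and the model files, including those for the AX650, are in Releases. Now also supports ChineseCLIP.☆30Jun 19, 2025Updated 8 months ago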
- ☆77Nov 5, 2024Updated last year
- Triton Compiler related materials.☆42Jan 4, 2025Updated last year
- Core contracts for the trading/liquidity in PaintSwap☆11Jan 7, 2025Updated last year
- Paster core module using KiteX☆10Aug 30, 2023Updated 2 years ago
- ☆10Jun 24, 2020Updated 5 years ago
- flash attention tutorial written in python, triton, cuda, cutlass☆490Jan 20, 2026Updated last month
- rag-pinecone-ray☆11Aug 14, 2023Updated 2 years ago
- Move to https://github.com/The-Pocket/PocketFlow-Rust☆10May 7, 2025Updated 10 months ago
- A job management system for python☆10Updated this week
- Explains the conclusions of a logic program.☆10May 25, 2023Updated 2 years ago
- Official repository for the paper Local Linear Attention: An Optimal Interpolation of Linear and Softmax Attention For Test-Time Regressi…☆23Oct 1, 2025Updated 5 months ago
- Train I3D on NTU-RGB+D dataset in keras☆12Feb 5, 2019Updated 7 years ago
- Documentation pages for JimFawcett repositories☆10Jun 27, 2025Updated 8 months ago
- ☆11Jun 15, 2019Updated 6 years ago
- Accelerate LLM preference tuning via prefix sharing with a single line of code☆51Jul 4, 2025Updated 8 months ago
- ☆20Sep 11, 2025Updated 5 months ago
- LUKSO dApps template in Next.js☆11Jan 7, 2025Updated last year