8-bit CUDA functions for PyTorch
☆70 Sep 24, 2025 Updated 5 months ago
Alternatives and similar repositories for bitsandbytes
Users who are interested in bitsandbytes compare it to the libraries listed below.
- Fast and memory-efficient exact attention ☆224 Updated this week
- Hackable and optimized Transformers building blocks, supporting a composable construction. ☆34 Feb 26, 2026 Updated 3 weeks ago
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆113 Updated this week
- Ahead of Time (AOT) Triton Math Library ☆94 Updated this week
- Development repository for the Triton language and compiler ☆143 Updated this week
- ☆24 Jul 16, 2025 Updated 8 months ago
- MAGELLAN: Metacognitive predictions of learning progress guide autotelic LLM agents in large goal spaces ☆10 Mar 24, 2025 Updated 11 months ago
- LLM as World Models using Bayesian inference ☆16 May 27, 2025 Updated 9 months ago
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆139 Mar 13, 2026 Updated last week
- Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more ☆26 Updated this week
- a simple Flash Attention v2 implementation with ROCM (RDNA3 GPU, roc wmma), mainly used for stable diffusion(ComfyUI) in Windows ZLUDA en… ☆51 Aug 25, 2024 Updated last year
- AMD's graph optimization engine. ☆284 Updated this week
- ☆12 Nov 30, 2018 Updated 7 years ago
- ☆14 Sep 24, 2018 Updated 7 years ago
- Demo playbook for openstack Ansible roles ☆15 Aug 2, 2015 Updated 10 years ago
- A repository for contributed EasyConfig files that LUMI users can install at their own discretion or use as a starting base for their own… ☆18 Updated this week
- A low-cost, high-performance deep learning training framework that enables efficient 100B-scale model fine-tuning on a commodity server w… ☆24 Mar 21, 2025 Updated last year
- Fast and memory-efficient exact attention ported to rocm ☆13 Dec 1, 2023 Updated 2 years ago
- Official PyTorch implementation of CD-MOE ☆12 Updated this week
- Genetics for Language Models ☆17 Jul 1, 2024 Updated last year
- A non-root version of traceroute written in Rust ☆16 Apr 22, 2021 Updated 4 years ago
- The AMD rocAL is designed to efficiently decode and process images and videos from a variety of storage formats and modify them through a… ☆23 Updated this week
- hipDF - GPU DataFrame Library ☆16 Updated this week
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆26 Mar 11, 2026 Updated last week
- ☆172 Updated this week
- Benchmarking tool for vLLM inference performance with GPU monitoring ☆42 Nov 24, 2025 Updated 3 months ago
- Automated Design of Agentic Systems ☆10 Sep 7, 2024 Updated last year
- ☆26 Nov 13, 2025 Updated 4 months ago
- The subway goes coo-coo-coo too ☆10 Sep 12, 2020 Updated 5 years ago
- GoldFinch and other hybrid transformer components ☆12 Dec 9, 2025 Updated 3 months ago
- Mirror only; see https://gitlab.rtems.org/rtems/docs/rtems-docs/ ☆10 Mar 13, 2026 Updated last week
- Does all kinds of cool stuff to make analyzing meta classes easier. Now featuring WRedLogger.py, the previous backend of NetDbg ☆10 Jun 7, 2023 Updated 2 years ago
- PyTorch implementation for Decomposed Convolutional Filters Network ☆23 Feb 19, 2020 Updated 6 years ago
- Image processing tool for ComfyUI ☆13 Aug 6, 2025 Updated 7 months ago
- A recommendation model kernel optimizing system ☆12 Jun 5, 2025 Updated 9 months ago
- ☆13 Mar 10, 2026 Updated last week
- HIPIFY: Convert CUDA to Portable C++ Code ☆673 Mar 12, 2026 Updated last week
- Scripts to recover (accidentally) deleted files from ext3 partitions ☆14 Aug 16, 2017 Updated 8 years ago
- [DEPRECATED] Moved to ROCm/rocm-libraries repo ☆257 Updated this week