dblalock/bolt

10x faster matrix and vector operations

☆2,513

Alternatives and similar repositories for bolt

Users who are interested in bolt are comparing it to the libraries listed below.

joennlae / halutmatmul
Hashed Lookup Table based Matrix Multiplication (halutmatmul) - Stella Nera accelerator
☆214 · Dec 10, 2023 · Updated 2 years ago
facebookincubator / AITemplate
AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (N…
☆4,726 · Jul 14, 2026 · Updated 2 weeks ago
mosaicml / composer
Supercharge Your Model Training
☆5,491 · Apr 29, 2026 · Updated 3 months ago
triton-lang / triton
Development repository for the Triton language and compiler
☆19,812 · Updated this week
google / highway
Performance-portable, length-agnostic SIMD with runtime dispatch
☆5,722 · Updated this week
jax-ml / jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
☆36,070 · Updated this week
arogozhnikov / einops
Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)
☆9,564 · Jul 5, 2026 · Updated 3 weeks ago
libffcv / ffcv
FFCV: Fast Forward Computer Vision (and other ML workloads!)
☆2,993 · Jun 16, 2024 · Updated 2 years ago
taichi-dev / taichi
Productive, portable, and performant GPU programming in Python.
☆28,310 · Jul 6, 2026 · Updated 3 weeks ago
facebookresearch / AugLy
A data augmentations library for audio, image, text, and video.
☆5,086 · Jul 16, 2026 · Updated last week
Lightning-AI / pytorch-lightning
Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
☆31,258 · Updated this week
flashlight / flashlight
A C++ standalone library for machine learning
☆5,462 · Jun 22, 2026 · Updated last month
facebookresearch / faiss
A library for efficient similarity search and clustering of dense vectors.
☆40,593 · Updated this week
ELS-RD / kernl
Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackab…
☆1,586 · Jan 28, 2026 · Updated 6 months ago
pytorch / torchdynamo
A Python-level JIT compiler designed to make unmodified PyTorch programs faster.
☆1,078 · Apr 17, 2024 · Updated 2 years ago
FMInference / FlexLLMGen
Running large language models on a single GPU for throughput-oriented scenarios.
☆9,364 · Oct 28, 2024 · Updated last year
microsoft / hummingbird
Hummingbird compiles trained ML models into tensor computation for faster inference.
☆3,536 · Jul 17, 2025 · Updated last year
facebookresearch / torchdim
Named tensors with first-class dimensions for PyTorch
☆334 · Jun 14, 2023 · Updated 3 years ago
deepspeedai / DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
☆42,830 · Updated this week
apache / tvm
Open Machine Learning Compiler Framework
☆13,627 · Updated this week
NVlabs / tiny-cuda-nn
Lightning fast C++/CUDA neural network framework
☆4,518 · Apr 21, 2026 · Updated 3 months ago
google-research / google-research
Google Research
☆38,442 · Updated this week
kornia / kornia
🐍 Geometric Computer Vision Library for Spatial AI
☆11,294 · Updated this week
facebookresearch / fairscale
PyTorch extensions for high performance and large scale training.
☆3,411 · Apr 26, 2025 · Updated last year
huggingface / pytorch-image-models
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights --…
☆37,024 · Updated this week
tinygrad / tinygrad
You like pytorch? You like micrograd? You love tinygrad! ❤️
☆33,370 · Updated this week
exaloop / codon
A high-performance, zero-overhead, extensible Python compiler with built-in NumPy support
☆16,813 · Updated this week
facebookresearch / fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
☆32,249 · Sep 30, 2025 · Updated 9 months ago
arrayfire / arrayfire
ArrayFire: a general purpose GPU library.
☆4,897 · Mar 7, 2026 · Updated 4 months ago
pytorch / FBGEMM
FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/
☆1,573 · Updated this week
rentruewang / aioway
AI on the way. An auto deep learning pipe dream. An RDBMS approach to deep learning. Declarative, explainable, scalable, optimizable, eas…
☆1,825 · Updated this week
getkeops / keops
KErnel OPerationS, on CPUs and GPUs, with autodiff and without memory overflows
☆1,188 · Jul 17, 2026 · Updated last week
bitsandbytes-foundation / bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.
☆8,369 · Updated this week
microsoft / unilm
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
☆22,173 · Jan 23, 2026 · Updated 6 months ago
bytedance / lightseq
LightSeq: A High Performance Library for Sequence Processing and Generation
☆3,297 · May 16, 2023 · Updated 3 years ago
cupy / cupy
NumPy & SciPy for GPU
☆12,219 · Updated this week
bloomberg / memray
Memray is a memory profiler for Python
☆15,182 · Updated this week
nebuly-ai / optimate
A collection of libraries to optimise AI model performances
☆8,333 · Jul 22, 2024 · Updated 2 years ago
lucidrains / DALLE-pytorch
Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch
☆5,628 · Feb 17, 2024 · Updated 2 years ago