Tensor parallelism is all you need. Run LLMs on an AI cluster at home using any device. Distribute the workload, divide RAM usage, and increase inference speed.
☆18Nov 11, 2024Updated last year
Alternatives and similar repositories for distributed-llama
Users that are interested in distributed-llama are comparing it to the libraries listed below.
- an unofficial Georgia Tech theme for JupyterLab☆10Jun 29, 2021Updated 4 years ago
- Handle Android "draw over other apps" permissions and queries in a version-agnostic way☆22Jun 17, 2019Updated 6 years ago
- A distributed implementation of alphafold3 based on xfold and tpp-pytorch-extension☆12Mar 26, 2026Updated 2 weeks ago
- MiniLM (BERT) embeddings from scratch☆19Aug 14, 2025Updated 7 months ago
- ☆13Dec 3, 2023Updated 2 years ago
- ☆14Sep 4, 2024Updated last year
- 🎥🎯 Tracking dart coordinates with fastai v2☆11Jan 14, 2024Updated 2 years ago
- meson android build PoC☆11Oct 29, 2019Updated 6 years ago
- ☆20May 30, 2025Updated 10 months ago
- A proxy that hosts multiple single-model runners such as LLama.cpp and vLLM☆13May 30, 2025Updated 10 months ago
- Tries to UI development. Clone of https://www.perplexity.ai/☆11Sep 30, 2023Updated 2 years ago
- ☆10Dec 29, 2024Updated last year
- A MetaMask-fork to support pluggable identity contracts.☆11Dec 30, 2022Updated 3 years ago
- Ubuntu 24.04.x OS image builder for various RK3506 SBC☆26Apr 4, 2026Updated last week
- Play with OpenAI API's using your own API Key. Your API Key is stored and used only from your browser.☆15Dec 20, 2025Updated 3 months ago
- Bloom filter alternative (C++)☆18Nov 8, 2018Updated 7 years ago
- Unofficial Claude Code SDKs for Typescript and Python☆15May 20, 2025Updated 10 months ago
- ☆19May 15, 2023Updated 2 years ago
- This repository is intended as a comprehensive guide to prepare for interviews focused on generative AI. It serves as a one-stop resource…☆11Dec 13, 2024Updated last year
- lossily compress representation vectors using product quantization☆59Oct 28, 2025Updated 5 months ago
- Video: https://youtu.be/hpR1vvaQJaM☆15May 1, 2025Updated 11 months ago
- ☆15Mar 6, 2021Updated 5 years ago
- Source code for Youtube tutorial series on chest X-ray auto diagnosis☆13Sep 26, 2020Updated 5 years ago
- Tools for merging pretrained large language models.☆19Jun 12, 2024Updated last year
- Deploy fastai models with Docker☆19Sep 27, 2020Updated 5 years ago
- ☆16Nov 24, 2025Updated 4 months ago
- Codebase for VidHal: Benchmarking Hallucinations in Vision LLMs☆14Apr 19, 2025Updated 11 months ago
- A Gentle Introduction to Transformers Neural Network☆15Mar 3, 2024Updated 2 years ago
- Simple readonly FUSE driver for FAT filesystems☆12Jan 27, 2016Updated 10 years ago
- ☆18Sep 7, 2025Updated 7 months ago
- Additional functionality for use with fastai’s medical imaging module☆15Jul 20, 2022Updated 3 years ago
- My stack☆26Jul 10, 2025Updated 9 months ago
- ☆30Nov 5, 2024Updated last year
- Multivariate Bayesian Structural Time Series in Stan☆13Apr 13, 2020Updated 5 years ago
- Official implementation of the paper "Distilling a Pretrained Language Model to a Multilingual ASR Model" (Interspeech 2022)☆12Mar 12, 2024Updated 2 years ago
- Low-latency ASR using SpeechBrain StreamingASR and torchaudio StreamReader.☆18Apr 19, 2025Updated 11 months ago
- ☆19Jul 24, 2025Updated 8 months ago
- Kubernetes deployment strategies from "DB Schemas & Kubernetes Rollouts" blogpost☆20May 8, 2019Updated 6 years ago
- LLM inference in C/C++☆21Apr 2, 2026Updated last week