Tensor parallelism is all you need. Run LLMs on an AI cluster at home using any device. Distribute the workload, divide RAM usage, and increase inference speed.
☆18 Nov 11, 2024 Updated last year
Alternatives and similar repositories for distributed-llama
Users who are interested in distributed-llama are comparing it to the libraries listed below.
- An unofficial Georgia Tech theme for JupyterLab ☆10 Jun 29, 2021 Updated 4 years ago
- This repository is intended as a comprehensive guide to prepare for interviews focused on generative AI. It serves as a one-stop resource… ☆11 Dec 13, 2024 Updated last year
- ☆15 Mar 6, 2021 Updated 4 years ago
- Modification of SOMPY repo with robust K-means clustering (bootstrapped SSE elbow method) ☆13 Apr 6, 2019 Updated 6 years ago
- An extension to the V clipboard library with additional support ☆12 Jul 25, 2020 Updated 5 years ago
- Build a minimal highly available (HA) Kubernetes cluster with zero effort in less than 10 minutes in Hetzner Cloud aka hcloud with Terraf… ☆11 Dec 10, 2025 Updated 2 months ago
- OpenNNFX, a community-based open source repository of all things No Nonsense Forex ☆17 Apr 22, 2023 Updated 2 years ago
- Steganography Reverse Shell ☆10 Apr 22, 2023 Updated 2 years ago
- Meson Android build PoC ☆11 Oct 29, 2019 Updated 6 years ago
- Proxy for OpenAI ☆15 Sep 2, 2025 Updated 6 months ago
- Find the electricity market clearing price and clearing quantity (graphical method) using Python. ☆15 Nov 22, 2021 Updated 4 years ago
- Examples of Using DBTunnel ☆11 Apr 24, 2024 Updated last year
- ☆13 Dec 3, 2023 Updated 2 years ago
- Python tools for Anki ☆23 Feb 14, 2026 Updated 2 weeks ago
- ☆14 Sep 4, 2024 Updated last year
- Flask-based web application for predicting the income of a person ☆13 Dec 23, 2018 Updated 7 years ago
- Chat GPT Things by Taylor Newsome ☆12 Mar 19, 2024 Updated last year
- Play with OpenAI APIs using your own API key. Your API key is stored and used only from your browser. ☆15 Dec 20, 2025 Updated 2 months ago
- Sent𝕏Ment: Advanced Sentiment Analysis Tool ☆13 Aug 1, 2023 Updated 2 years ago
- Official implementation of the paper "Distilling a Pretrained Language Model to a Multilingual ASR Model" (Interspeech 2022) ☆12 Mar 12, 2024 Updated last year
- Code to implement Maximum Entropy Deep Inverse Reinforcement Learning. ☆14 Jul 3, 2020 Updated 5 years ago
- A distributed end-to-end image classification system using Kubernetes ☆14 Dec 31, 2024 Updated last year
- ☆16 Nov 24, 2025 Updated 3 months ago
- ☆11 Jun 26, 2017 Updated 8 years ago
- A proxy that hosts multiple single-model runners such as llama.cpp and vLLM ☆12 May 30, 2025 Updated 9 months ago
- Quick occlusion culling demo using OpenGL and OpenCL ☆20 Apr 25, 2012 Updated 13 years ago
- ☆13 Oct 16, 2024 Updated last year
- Tools for merging pretrained large language models. ☆19 Jun 12, 2024 Updated last year
- Reportlab git mirror ☆20 Jan 30, 2023 Updated 3 years ago
- Machine Learning (ML) research within medicine and healthcare represents one of the most challenging domains for both engineers and medic… ☆16 Aug 1, 2020 Updated 5 years ago
- Probabilistic Contrastive Learning for Domain Adaptation ☆15 May 22, 2024 Updated last year
- Low-latency ASR using SpeechBrain StreamingASR and torchaudio StreamReader. ☆18 Apr 19, 2025 Updated 10 months ago
- An all-purpose random library written in V. ☆17 Aug 14, 2022 Updated 3 years ago
- Using llama.cpp and onnxruntime to accelerate inference of GOT-OCR2.0 ☆15 Mar 6, 2025 Updated 11 months ago
- The official repository for AdaMuon ☆35 Aug 27, 2025 Updated 6 months ago
- Automatic Fibonacci extensions/retracements for Machine Learning of price ☆18 Nov 21, 2024 Updated last year
- ☆19 Jul 24, 2025 Updated 7 months ago
- ☆18 Sep 18, 2025 Updated 5 months ago
- Automate your LinkedIn networking using an AI agent ☆17 Jan 21, 2025 Updated last year