danielgross/ggml-k8s

danielgross / ggml-k8s

Run GGML models with Kubernetes.

☆177

Alternatives and similar repositories for ggml-k8s

Users that are interested in ggml-k8s are comparing it to the libraries listed below.

teknium1 / ShareGPT-Builder
☆126 · Dec 18, 2024 · Updated last year
teknium1 / LLM-Benchmark-Logs
Just a bunch of benchmark logs for different LLMs
☆130 · Jul 28, 2024 · Updated last year
RyanLucas3 / poasterGPT
A single notebook for fine-tuning GPT-3.5 turbo
☆31 · Aug 16, 2024 · Updated last year
strangeloopcanon / tevo
TEVO: evolve LM motifs cheaply, then validate them in downstream train.py loops.
☆19 · Apr 18, 2026 · Updated 3 months ago
NousResearch / StripedHyenaTrainer
☆67 · Dec 8, 2023 · Updated 2 years ago
yacineMTB / llama.cpp
Port of Facebook's LLaMA model in C/C++
☆16 · Jul 3, 2023 · Updated 3 years ago
rlancemartin / doc-gpt
☆55 · Mar 7, 2023 · Updated 3 years ago
spirobel / bunny-llama
iterate quickly with llama.cpp hot reloading. use the llama.cpp bindings with bun.sh
☆51 · Oct 30, 2023 · Updated 2 years ago
yacineMTB / talk
Let's make sand talk
☆590 · Oct 17, 2023 · Updated 2 years ago
notarussianteenager / srf-attention
Simplex Random Feature attention, in PyTorch
☆76 · Oct 10, 2023 · Updated 2 years ago
euclaise / SlimTrainer
Full finetuning of large language models without large memory requirements
☆92 · Sep 22, 2025 · Updated 10 months ago
danielgross / teleprompter
☆337 · Dec 20, 2022 · Updated 3 years ago
deepshard / mixtral-8x7b-Inference
Eh, simple and works.
☆27 · Dec 9, 2023 · Updated 2 years ago
danielgross / localpilot
☆3,352 · Feb 25, 2024 · Updated 2 years ago
jbilcke-hf / atryon
[WIP] AI Try-On plugin for Chrome
☆28 · Mar 16, 2024 · Updated 2 years ago
gavi / mlx-whatsapp
An mlx project to train a base model on your whatsapp chats using (Q)Lora finetuning
☆174 · Jan 14, 2024 · Updated 2 years ago
distantmagic / structured
Extracts structured data from unstructured input. Programming language agnostic. Uses llama.cpp
☆45 · May 16, 2024 · Updated 2 years ago
eryk-mazus / sigh
Seamless Voice Interactions with LLMs
☆12 · Oct 28, 2023 · Updated 2 years ago
ggerganov / vit.cpp
Inference Vision Transformer (ViT) in plain C/C++ with ggml
☆31 · Nov 23, 2023 · Updated 2 years ago
strangeloopcanon / contrail
☆27 · Jun 16, 2026 · Updated last month
ggml-org / p1
LLM-based code completion engine
☆194 · Jan 23, 2025 · Updated last year
mistralai / megablocks-public
☆865 · Dec 8, 2023 · Updated 2 years ago
sabetAI / BLoRA
batched loras
☆350 · Sep 6, 2023 · Updated 2 years ago
agentsea / toolfuse
A common protocol for AI agent tools
☆10 · Oct 21, 2024 · Updated last year
Mihaiii / trivia
A live multiplayer trivia game where users can bid for the subject of the next question
☆29 · Jan 9, 2026 · Updated 6 months ago
cedana / cedana-cli
CLI to the Cedana Service
☆60 · May 5, 2025 · Updated last year
FL33TW00D / embd
GPU accelerated client-side embeddings for vector search, RAG etc.
☆65 · Dec 4, 2023 · Updated 2 years ago
arnavc1712 / YouQ
Chrome Extension for YouTube. Acts as an assistant for the YouTube video you are watching
☆22 · Apr 26, 2023 · Updated 3 years ago
camenduru / daclip-uir-colab
☆13 · Oct 12, 2023 · Updated 2 years ago
joey00072 / Tinytorch
A really tiny autograd engine
☆100 · May 26, 2025 · Updated last year
yoheinakajima / captainfunction
A Python package to dynamically load functions for OpenAI Assistant
☆55 · Dec 6, 2023 · Updated 2 years ago
teknium1 / transformers-gptq-quant
☆46 · Oct 13, 2023 · Updated 2 years ago
meta-pytorch / gpt-fast
Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
☆6,229 · Aug 22, 2025 · Updated 11 months ago
apoorvumang / prompt-lookup-decoding
Simple speculative decoding technique, integrated in vLLM and transformers
☆611 · Aug 23, 2024 · Updated last year
progremir / carbon-almanac.ai
Chatbot for The Carbon Almanac book or a climate change related topic
☆16 · Mar 6, 2023 · Updated 3 years ago
prem-research / prem-operator
📡 Deploy AI models and apps to Kubernetes without developing a hernia
☆33 · May 23, 2024 · Updated 2 years ago
abetlen / program-constrained-language-model-sampling
☆35 · Apr 8, 2023 · Updated 3 years ago
Contextualist / lone-arena
Self-hosted LLM chatbot arena, with yourself as the only judge
☆41 · Feb 6, 2024 · Updated 2 years ago
OpenAgentLLM / OpenAgent
🔓 The open-source autonomous agent LLM initiative 🔓
☆91 · Feb 12, 2024 · Updated 2 years ago