Run GGML models with Kubernetes.
☆177Dec 17, 2023Updated 2 years ago
Alternatives and similar repositories for ggml-k8s
Users who are interested in ggml-k8s are comparing it to the repositories listed below.
- Just a bunch of benchmark logs for different LLMs☆130Jul 28, 2024Updated last year
- ☆126Dec 18, 2024Updated last year
- A collection of text embedding experiments☆55Feb 27, 2023Updated 3 years ago
- A single notebook for fine-tuning GPT-3.5 turbo☆31Aug 16, 2024Updated last year
- ☆68Dec 8, 2023Updated 2 years ago
- ☆10Jul 17, 2023Updated 2 years ago
- Port of Facebook's LLaMA model in C/C++☆16Jul 3, 2023Updated 3 years ago
- ☆55Mar 7, 2023Updated 3 years ago
- Let's make sand talk☆590Oct 17, 2023Updated 2 years ago
- Guess the Hacker News titles☆13Mar 24, 2022Updated 4 years ago
- Raspberry-based E-Paper Smart Home Display Project☆21Apr 13, 2026Updated 2 months ago
- ☆337Dec 20, 2022Updated 3 years ago
- Full finetuning of large language models without large memory requirements☆93Sep 22, 2025Updated 9 months ago
- ☆19Apr 4, 2023Updated 3 years ago
- ☆10Oct 24, 2024Updated last year
- Simplex Random Feature attention, in PyTorch☆76Oct 10, 2023Updated 2 years ago
- ☆866Dec 8, 2023Updated 2 years ago
- ☆3,357Feb 25, 2024Updated 2 years ago
- [WIP] AI Try-On plugin for Chrome☆28Mar 16, 2024Updated 2 years ago
- Eh, simple and works.☆27Dec 9, 2023Updated 2 years ago
- ☆51Jan 31, 2024Updated 2 years ago
- The Codec 2 speech codec, compiled to WASM using Emscripten.☆13Apr 27, 2023Updated 3 years ago
- An mlx project to train a base model on your whatsapp chats using (Q)Lora finetuning☆173Jan 14, 2024Updated 2 years ago
- ANE accelerated embedding models!☆20Dec 11, 2024Updated last year
- Seamless Voice Interactions with LLMs☆12Oct 28, 2023Updated 2 years ago
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents☆54Feb 27, 2025Updated last year
- Inference Vision Transformer (ViT) in plain C/C++ with ggml☆31Nov 23, 2023Updated 2 years ago
- Extracts structured data from unstructured input. Programming language agnostic. Uses llama.cpp☆45May 16, 2024Updated 2 years ago
- GGML implementation of BERT model with Python bindings and quantization.☆57Feb 19, 2024Updated 2 years ago
- LLM-based code completion engine☆195Jan 23, 2025Updated last year
- batched loras☆350Sep 6, 2023Updated 2 years ago
- Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".☆279Nov 3, 2023Updated 2 years ago
- Giving full autonomy to AI agents on X☆42Nov 14, 2025Updated 7 months ago
- ☆16Nov 13, 2023Updated 2 years ago
- CLI to the Cedana Service☆59May 5, 2025Updated last year
- GPU accelerated client-side embeddings for vector search, RAG etc.☆65Dec 4, 2023Updated 2 years ago
- Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.☆6,224Aug 22, 2025Updated 10 months ago
- A really tiny autograd engine☆100May 26, 2025Updated last year
- Chatbot for The Carbon Almanac book or a climate change related topic☆16Mar 6, 2023Updated 3 years ago