Run GGML models with Kubernetes.
☆175 · Dec 17, 2023 · Updated 2 years ago
Alternatives and similar repositories for ggml-k8s
Users that are interested in ggml-k8s are comparing it to the libraries listed below.
- Just a bunch of benchmark logs for different LLMs · ☆124 · Jul 28, 2024 · Updated last year
- ☆122 · Dec 18, 2024 · Updated last year
- A collection of text embedding experiments · ☆55 · Feb 27, 2023 · Updated 3 years ago
- A single notebook for fine-tuning GPT-3.5 turbo · ☆31 · Aug 16, 2024 · Updated last year
- ☆65 · Dec 8, 2023 · Updated 2 years ago
- ☆10 · Jul 17, 2023 · Updated 2 years ago
- Port of Facebook's LLaMA model in C/C++ · ☆16 · Jul 3, 2023 · Updated 2 years ago
- ☆56 · Mar 7, 2023 · Updated 3 years ago
- Let's make sand talk · ☆588 · Oct 17, 2023 · Updated 2 years ago
- Guess the Hacker News titles · ☆12 · Mar 24, 2022 · Updated 4 years ago
- ☆335 · Dec 20, 2022 · Updated 3 years ago
- Full finetuning of large language models without large memory requirements · ☆94 · Sep 22, 2025 · Updated 7 months ago
- ☆19 · Apr 4, 2023 · Updated 3 years ago
- ☆10 · Oct 24, 2024 · Updated last year
- Simplex Random Feature attention, in PyTorch · ☆76 · Oct 10, 2023 · Updated 2 years ago
- ☆868 · Dec 8, 2023 · Updated 2 years ago
- ☆3,364 · Feb 25, 2024 · Updated 2 years ago
- Eh, simple and works. · ☆27 · Dec 9, 2023 · Updated 2 years ago
- An mlx project to train a base model on your whatsapp chats using (Q)Lora finetuning · ☆173 · Jan 14, 2024 · Updated 2 years ago
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents · ☆53 · Feb 27, 2025 · Updated last year
- Extracts structured data from unstructured input. Programming language agnostic. Uses llama.cpp · ☆45 · May 16, 2024 · Updated last year
- GGML implementation of BERT model with Python bindings and quantization. · ☆57 · Feb 19, 2024 · Updated 2 years ago
- LLM-based code completion engine · ☆192 · Jan 23, 2025 · Updated last year
- A common protocol for AI agent tools · ☆10 · Oct 21, 2024 · Updated last year
- batched loras · ☆351 · Sep 6, 2023 · Updated 2 years ago
- Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models". · ☆280 · Nov 3, 2023 · Updated 2 years ago
- Giving full autonomy to AI agents on X · ☆40 · Nov 14, 2025 · Updated 5 months ago
- ☆16 · Nov 13, 2023 · Updated 2 years ago
- GPU accelerated client-side embeddings for vector search, RAG etc. · ☆65 · Dec 4, 2023 · Updated 2 years ago
- Simple and efficient pytorch-native transformer text generation in <1000 LOC of python. · ☆6,204 · Aug 22, 2025 · Updated 8 months ago
- A really tiny autograd engine · ☆100 · May 26, 2025 · Updated 11 months ago
- Chatbot for The Carbon Almanac book or a climate change related topic · ☆16 · Mar 6, 2023 · Updated 3 years ago
- Port of Facebook's LLaMA model in C/C++ · ☆21 · Nov 6, 2023 · Updated 2 years ago
- A Python package to dynamically load functions for OpenAI Assistant · ☆55 · Dec 6, 2023 · Updated 2 years ago
- Code to create bugged python scripts for OpenAssistant Training, maintained by https://twitter.com/Cyndesama · ☆24 · Jul 23, 2023 · Updated 2 years ago
- ☆606 · Aug 23, 2024 · Updated last year
- Chrome Extension for YouTube. Acts as an assistant for the YouTube video you are watching · ☆23 · Apr 26, 2023 · Updated 3 years ago
- Speech-to-text transcription VST3/ARA plugin · ☆59 · Apr 13, 2026 · Updated 3 weeks ago
- ☆45 · Oct 13, 2023 · Updated 2 years ago