Run GGML models with Kubernetes.
☆175 · Dec 17, 2023 · Updated 2 years ago
Alternatives and similar repositories for ggml-k8s
Users who are interested in ggml-k8s are comparing it to the libraries listed below.
- Just a bunch of benchmark logs for different LLMs ☆119 · Jul 28, 2024 · Updated last year
- ☆119 · Dec 18, 2024 · Updated last year
- A single notebook for fine-tuning GPT-3.5 turbo ☆31 · Aug 16, 2024 · Updated last year
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents ☆48 · Feb 27, 2025 · Updated last year
- A collection of text embedding experiments ☆55 · Feb 27, 2023 · Updated 3 years ago
- ☆56 · Mar 7, 2023 · Updated 3 years ago
- ☆19 · Apr 4, 2023 · Updated 2 years ago
- Port of Facebook's LLaMA model in C/C++ ☆21 · Nov 6, 2023 · Updated 2 years ago
- Let's make sand talk ☆588 · Oct 17, 2023 · Updated 2 years ago
- batched loras ☆350 · Sep 6, 2023 · Updated 2 years ago
- ☆12 · Sep 26, 2023 · Updated 2 years ago
- An open-source non-official community implementation of the model from the paper: Surgical Robot Transformer (SRT): Imitation Learning fo… ☆11 · Feb 9, 2026 · Updated last month
- ☆62 · Dec 8, 2023 · Updated 2 years ago
- [WIP] AI Try-On plugin for Chrome ☆28 · Mar 16, 2024 · Updated last year
- LLM training in simple, raw C/CUDA ☆18 · May 6, 2024 · Updated last year
- ☆40 · Mar 25, 2024 · Updated last year
- Full finetuning of large language models without large memory requirements ☆94 · Sep 22, 2025 · Updated 5 months ago
- Chrome Extension for YouTube. Acts as an assistant for the YouTube video you are watching ☆23 · Apr 26, 2023 · Updated 2 years ago
- Simplex Random Feature attention, in PyTorch ☆76 · Oct 10, 2023 · Updated 2 years ago
- BH hackathon ☆14 · Apr 4, 2024 · Updated last year
- A distributed execution framework built upon lunatic. ☆16 · Jan 19, 2024 · Updated 2 years ago
- Seamless Voice Interactions with LLMs ☆12 · Oct 28, 2023 · Updated 2 years ago
- Guess the Hacker News titles ☆12 · Mar 24, 2022 · Updated 3 years ago
- extending laughbot project to encoder-based transformer model finetuned on same dataset for humor classification ☆10 · Jan 4, 2023 · Updated 3 years ago
- ☆11 · Dec 11, 2024 · Updated last year
- ☆10 · Jul 17, 2023 · Updated 2 years ago
- ☆45 · Oct 13, 2023 · Updated 2 years ago
- Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models". ☆281 · Nov 3, 2023 · Updated 2 years ago
- inference code for mixtral-8x7b-32kseqlen ☆104 · Dec 12, 2023 · Updated 2 years ago
- ☆3,371 · Feb 25, 2024 · Updated 2 years ago
- ☆27 · Mar 14, 2024 · Updated last year
- LLM plugin for models hosted by Anyscale Endpoints ☆35 · Apr 22, 2024 · Updated last year
- GGML implementation of BERT model with Python bindings and quantization. ☆57 · Feb 19, 2024 · Updated 2 years ago
- Skybox previewer and generator using BlockadeLabs ☆15 · May 13, 2023 · Updated 2 years ago
- The Codec 2 speech codec, compiled to WASM using Emscripten. ☆13 · Apr 27, 2023 · Updated 2 years ago
- ☆10 · Oct 24, 2024 · Updated last year
- Implementation of the paper: "Audio Mamba: Bidirectional State Space Model for Audio Representation Learning" in pytorch