LLM-powered lossless compression tool
☆305Jan 2, 2026Updated 2 months ago
Alternatives and similar repositories for llama-zip
Users who are interested in llama-zip are comparing it to the libraries listed below
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs.☆41May 24, 2024Updated last year
- A live multiplayer trivia game where users can bid for the subject of the next question☆29Jan 9, 2026Updated last month
- Spotlight-like client for Ollama on Windows.☆28May 18, 2024Updated last year
- Inference of Mamba and Mamba2 models in pure C☆197Jan 22, 2026Updated last month
- An application for running LLMs locally on your device, with your documents, facilitating detailed citations in generated responses.☆631Oct 29, 2024Updated last year
- This project showcases engaging interactions between two AI chatbots.☆10Jan 10, 2024Updated 2 years ago
- Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI …☆56Feb 10, 2025Updated last year
- Create text chunks which end at natural stopping points without using a tokenizer☆26Nov 26, 2025Updated 3 months ago
- Simple Summarizer Tool using Llama 3 8b.☆10May 14, 2024Updated last year
- asynchronous/distributed speculative evaluation for llama3☆40Aug 8, 2024Updated last year
- JacQues is a Dash-based interactive web application that facilitates real-time chat and document management.☆22Jan 5, 2026Updated 2 months ago
- V.I.S.O.R., my in-development AI-powered voice assistant with integrated memory!☆36Nov 20, 2025Updated 3 months ago
- ☆21Jan 25, 2025Updated last year
- an auto-sleeping and -waking framework around llama.cpp☆12Feb 8, 2025Updated last year
- Experience the power of AI with this free AI voice generator demo. Utilizing Deepgram and Groq, we transform text into voice seamlessly. …☆37Jun 12, 2024Updated last year
- LLM inference in C/C++☆23Oct 4, 2024Updated last year
- Serving LLMs in the HF-Transformers format via a PyFlask API☆72Sep 10, 2024Updated last year
- Testing LLM reasoning abilities with family relationship quizzes.☆63Jan 28, 2025Updated last year
- A quick and optimized solution to manage llama based gguf quantized models, download gguf files, retrieve message formatting, add more mo…☆12Jan 13, 2024Updated 2 years ago
- Dataset Crafting w/ RAG/Wikipedia ground truth and Efficient Fine-Tuning Using MLX and Unsloth. Includes configurable dataset annotation …☆195Jul 21, 2024Updated last year
- Web UI for ExLlamaV2☆512Feb 5, 2025Updated last year
- klmbr - a prompt pre-processing technique to break through the barrier of entropy while generating text with LLMs☆86Sep 22, 2024Updated last year
- Tiny ASIC implementation for "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits" matrix multiplication unit☆178Apr 19, 2024Updated last year
- WebAssembly binding for llama.cpp - Enabling on-browser LLM inference☆1,003Dec 17, 2025Updated 2 months ago
- My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel…☆12Jan 29, 2024Updated 2 years ago
- A simple library for working with Hugging Face models.☆14Dec 30, 2024Updated last year
- Large-scale LLM inference engine☆1,658Feb 17, 2026Updated 2 weeks ago
- Python package wrapping llama.cpp for on-device LLM inference☆101Oct 12, 2025Updated 4 months ago
- INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model☆1,562Mar 23, 2025Updated 11 months ago
- A JavaScript library (with TypeScript types) to parse metadata of GGML-based GGUF files.☆51Jul 30, 2024Updated last year
- Extend the Conditioning of Stable Diffusion to take Audio Embeddings Instead of Text Embeddings using Wav2Vec2-BERT model☆13Sep 25, 2024Updated last year
- Like grep but for natural language questions. Based on Mistral 7B or Mixtral 8x7B.☆387Mar 13, 2024Updated last year
- ☆210Jan 5, 2026Updated 2 months ago
- Create Custom LLMs☆1,810Nov 8, 2025Updated 3 months ago
- Gradio-based tool to run open-source LLM models directly from Hugging Face☆97Jun 27, 2024Updated last year
- Open source alternative to Perplexity AI with ability to run locally☆227Oct 9, 2024Updated last year
- A pure and fast NumPy implementation of Mamba with cache support.☆18Jun 16, 2024Updated last year
- Update your Ollama models to their latest versions with Bun!☆20Oct 22, 2024Updated last year
- convert a saved pytorch model to gguf and generate as much corresponding ggml c code as possible☆15Dec 19, 2023Updated 2 years ago