LLM-powered lossless compression tool
☆305Jan 2, 2026Updated 2 months ago
Alternatives and similar repositories for llama-zip
Users who are interested in llama-zip are comparing it to the libraries listed below.
- asynchronous/distributed speculative evaluation for llama3☆39Aug 8, 2024Updated last year
- Spotlight-like client for Ollama on Windows.☆28May 18, 2024Updated last year
- A live multiplayer trivia game where users can bid for the subject of the next question☆29Jan 9, 2026Updated 2 months ago
- An application for running LLMs locally on your device, with your documents, facilitating detailed citations in generated responses.☆633Oct 29, 2024Updated last year
- Inference of Mamba, Mamba2 and Mamba3 models in pure C☆199Mar 18, 2026Updated last week
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs.☆41May 24, 2024Updated last year
- V.I.S.O.R., my in-development AI-powered voice assistant with integrated memory!☆36Nov 20, 2025Updated 4 months ago
- Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI …☆56Feb 10, 2025Updated last year
- Update your Ollama models to their latest versions with Bun!☆20Oct 22, 2024Updated last year
- JacQues is a Dash-based interactive web application that facilitates real-time chat and document management.☆22Jan 5, 2026Updated 2 months ago
- Testing LLM reasoning abilities with family relationship quizzes.☆63Jan 28, 2025Updated last year
- Create text chunks which end at natural stopping points without using a tokenizer☆26Nov 26, 2025Updated 3 months ago
- Simple Summarizer Tool using Llama 3 8b.☆10May 14, 2024Updated last year
- klmbr - a prompt pre-processing technique to break through the barrier of entropy while generating text with LLMs☆86Sep 22, 2024Updated last year
- a lightweight, open-source blueprint for building powerful and scalable LLM chat applications☆28Jun 7, 2024Updated last year
- Tiny ASIC implementation for "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits" matrix multiplication unit☆181Apr 19, 2024Updated last year
- an auto-sleeping and -waking framework around llama.cpp☆12Feb 8, 2025Updated last year
- Experience the power of AI with this free AI voice generator demo. Utilizing Deepgram and Groq, we transform text into voice seamlessly. …☆37Jun 12, 2024Updated last year
- Something similar to Apple Intelligence?☆60Jul 3, 2024Updated last year
- ☆210Jan 5, 2026Updated 2 months ago
- A Javascript library (with Typescript types) to parse metadata of GGML based GGUF files.☆52Jul 30, 2024Updated last year
- INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model☆1,562Mar 23, 2025Updated last year
- Simple Tool Caller for llama.cpp☆11Aug 12, 2024Updated last year
- Download full or partial git-lfs repos without temporarily using 2x disk space☆31Oct 13, 2023Updated 2 years ago
- This project showcases engaging interactions between two AI chatbots.☆10Jan 10, 2024Updated 2 years ago
- Serving LLMs in the HF-Transformers format via a PyFlask API☆72Sep 10, 2024Updated last year
- Large-scale LLM inference engine☆1,681Mar 12, 2026Updated last week
- Dataset Crafting w/ RAG/Wikipedia ground truth and Efficient Fine-Tuning Using MLX and Unsloth. Includes configurable dataset annotation …☆195Jul 21, 2024Updated last year
- Open source alternative to Perplexity AI with ability to run locally☆229Oct 9, 2024Updated last year
- Create Custom LLMs☆1,820Nov 8, 2025Updated 4 months ago
- Y'all thought the dead internet theory wasn't real, but HERE IT IS☆209Apr 27, 2024Updated last year
- Python package wrapping llama.cpp for on-device LLM inference☆101Oct 12, 2025Updated 5 months ago
- Gradio based tool to run opensource LLM models directly from Huggingface☆97Jun 27, 2024Updated last year
- Web UI for ExLlamaV2☆510Feb 5, 2025Updated last year
- An implementation of bucketMul LLM inference☆227Jul 1, 2024Updated last year
- WebAssembly binding for llama.cpp - Enabling on-browser LLM inference☆1,016Dec 17, 2025Updated 3 months ago
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs"☆155Oct 15, 2024Updated last year
- The application performs real-time inference on audio from an ALSA capture device☆38Jun 19, 2025Updated 9 months ago
- Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference.☆2,865Feb 10, 2026Updated last month