LLM-powered lossless compression tool
☆307 Jan 2, 2026 Updated 3 months ago
Alternatives and similar repositories for llama-zip
Users who are interested in llama-zip are comparing it to the libraries listed below.
- A collection of experiments related to LLM inference with llama.cpp/mlx ☆40 Updated this week
- Spotlight-like client for Ollama on Windows. ☆28 May 18, 2024 Updated last year
- A live multiplayer trivia game where users can bid for the subject of the next question ☆29 Jan 9, 2026 Updated 3 months ago
- An application for running LLMs locally on your device, with your documents, facilitating detailed citations in generated responses. ☆633 Oct 29, 2024 Updated last year
- Inference of Mamba, Mamba2 and Mamba3 models in pure C ☆199 Mar 18, 2026 Updated 3 weeks ago
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs. ☆41 May 24, 2024 Updated last year
- V.I.S.O.R., my in-development AI-powered voice assistant with integrated memory! ☆36 Nov 20, 2025 Updated 4 months ago
- Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI … ☆57 Feb 10, 2025 Updated last year
- Update your Ollama models to their latest versions with Bun! ☆20 Oct 22, 2024 Updated last year
- JacQues is a Dash-based interactive web application that facilitates real-time chat and document management. ☆22 Jan 5, 2026 Updated 3 months ago
- Testing LLM reasoning abilities with family relationship quizzes. ☆63 Jan 28, 2025 Updated last year
- Create text chunks which end at natural stopping points without using a tokenizer ☆26 Nov 26, 2025 Updated 4 months ago
- Simple Summarizer Tool using Llama 3 8b. ☆10 May 14, 2024 Updated last year
- klmbr - a prompt pre-processing technique to break through the barrier of entropy while generating text with LLMs ☆86 Sep 22, 2024 Updated last year
- A lightweight, open-source blueprint for building powerful and scalable LLM chat applications ☆28 Jun 7, 2024 Updated last year
- Tiny ASIC implementation for "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits" matrix multiplication unit ☆186 Apr 19, 2024 Updated last year
- Experience the power of AI with this free AI voice generator demo. Utilizing Deepgram and Groq, we transform text into voice seamlessly. … ☆37 Jun 12, 2024 Updated last year
- Something similar to Apple Intelligence? ☆60 Jul 3, 2024 Updated last year
- ☆212 Jan 5, 2026 Updated 3 months ago
- A JavaScript library (with TypeScript types) to parse metadata of GGML based GGUF files. ☆52 Jul 30, 2024 Updated last year
- INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model ☆1,567 Mar 23, 2025 Updated last year
- Simple Tool Caller for llama.cpp ☆11 Aug 12, 2024 Updated last year
- Download full or partial git-lfs repos without temporarily using 2x disk space ☆31 Oct 13, 2023 Updated 2 years ago
- This project showcases engaging interactions between two AI chatbots. ☆10 Jan 10, 2024 Updated 2 years ago
- Serving LLMs in the HF-Transformers format via a PyFlask API ☆72 Sep 10, 2024 Updated last year
- Large-scale LLM inference engine ☆1,686 Mar 12, 2026 Updated last month
- Dataset Crafting w/ RAG/Wikipedia ground truth and Efficient Fine-Tuning Using MLX and Unsloth. Includes configurable dataset annotation … ☆194 Jul 21, 2024 Updated last year
- Create Custom LLMs ☆1,825 Nov 8, 2025 Updated 5 months ago
- Y'all thought the dead internet theory wasn't real, but HERE IT IS ☆208 Apr 27, 2024 Updated last year
- Gradio-based tool to run open-source LLM models directly from Hugging Face ☆97 Jun 27, 2024 Updated last year
- Web UI for ExLlamaV2 ☆511 Feb 5, 2025 Updated last year
- An implementation of bucketMul LLM inference ☆227 Jul 1, 2024 Updated last year
- WebAssembly binding for llama.cpp - Enabling on-browser LLM inference ☆1,029 Dec 17, 2025 Updated 3 months ago
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs" ☆155 Oct 15, 2024 Updated last year
- Distributed LLM inference. Connect home devices into a powerful cluster to accelerate LLM inference. More devices means faster inference. ☆2,892 Feb 10, 2026 Updated 2 months ago
- ☆21 Jan 25, 2025 Updated last year
- Like grep but for natural language questions. Based on Mistral 7B or Mixtral 8x7B. ☆386 Mar 13, 2024 Updated 2 years ago
- A fast inference library for running LLMs locally on modern consumer-class GPUs ☆4,493 Mar 4, 2026 Updated last month
- Open-source LLM load balancer and serving platform for self-hosting LLMs at scale 🏓🦙 Alternative to projects like llm-d, Docker Model R… ☆1,514 Apr 3, 2026 Updated last week