3x Faster Inference; Unofficial implementation of EAGLE Speculative Decoding
☆83 · Jul 3, 2025 · Updated 8 months ago
Alternatives and similar repositories for BaldEagle
Users who are interested in BaldEagle are comparing it to the libraries listed below.
- Official Implementation of "Learning Harmonized Representations for Speculative Sampling" (HASS) ☆54 · Mar 14, 2025 · Updated 11 months ago
- Fast and memory-efficient exact attention ☆18 · Updated this week
- Voice Cloning, Now Inside Kokoro. Generate natural multilingual speech and clone any target voice with ease. ☆47 · Updated this week
- an auto-sleeping and -waking framework around llama.cpp ☆12 · Feb 8, 2025 · Updated last year
- flow-merge is a powerful Python library that enables seamless merging of multiple transformer-based language models using the most popula… ☆20 · Feb 12, 2025 · Updated last year
- win32 native frontend for llama-cli ☆12 · Nov 2, 2024 · Updated last year
- ScribePal is an Open Source intelligent browser extension that leverages AI to empower your web experience by providing contextual insigh… ☆22 · Updated this week
- Mic-controlled mouse clicks ☆17 · Oct 6, 2025 · Updated 5 months ago
- A powerful system for crawling documentation websites, extracting code snippets, and providing fast search capabilities via MCP (Model C… ☆27 · Dec 25, 2025 · Updated 2 months ago
- A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM ☆262 · Updated this week
- Chatbot-to-speech using Orpheus TTS model. Interactive console app. ☆21 · May 1, 2025 · Updated 10 months ago
- TLS & API keys for your LLM APIs ☆20 · Dec 17, 2025 · Updated 2 months ago
- ☆17 · Dec 16, 2024 · Updated last year
- Personal voice assistant, with voice interruption and Twilio support ☆18 · Feb 24, 2025 · Updated last year
- Transplants vocabulary between language models, enabling the creation of draft models for speculative decoding WITHOUT retraining. ☆49 · Oct 29, 2025 · Updated 4 months ago
- Distributed Reinforcement Learning for LLM Fine-Tuning with multi-GPU utilization ☆22 · Mar 12, 2025 · Updated 11 months ago
- private-machine is an AI companion system with emotion, needs and goals simulation. Very silly, not based on real science. ☆30 · Feb 26, 2026 · Updated last week
- Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models ☆262 · Apr 23, 2024 · Updated last year
- Make Qwen3 Think like Gemini 2.5 Pro | Open webui function ☆25 · May 10, 2025 · Updated 9 months ago
- A modern, single-page web chat interface for local LLMs (Large Language Models), inspired by the visual style and UX of Anthropic's Claud… ☆29 · May 11, 2025 · Updated 9 months ago
- ☆41 · May 27, 2025 · Updated 9 months ago
- QQQ is an innovative and hardware-optimized W4A8 quantization solution for LLMs. ☆154 · Aug 21, 2025 · Updated 6 months ago
- Common AI Agent written with Go. Supports MCP, RAG, A2A, AI Memory ☆40 · Feb 9, 2026 · Updated 3 weeks ago
- B-Llama3o: a Llama 3 with vision and audio understanding, as well as text, audio, and animation data output. ☆26 · Jun 3, 2024 · Updated last year
- ☆32 · Jan 1, 2024 · Updated 2 years ago
- Official code for the paper "Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark" ☆30 · Jun 30, 2025 · Updated 8 months ago
- Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings) ☆369 · Apr 22, 2025 · Updated 10 months ago
- llmbasedos — Local-First OS Where Your AI Agents Wake Up and Work ☆283 · Jan 6, 2026 · Updated 2 months ago
- A simple no-install web UI for Ollama and OAI-Compatible APIs! ☆31 · Jan 30, 2025 · Updated last year
- Local drive deep search. ☆33 · Jun 4, 2025 · Updated 9 months ago
- The DPAB-α Benchmark ☆32 · Jan 15, 2025 · Updated last year
- Open WebUI tool — Give your LLM a persistent workspace with file storage, SQLite, archives, and collaboration. ☆76 · Feb 2, 2026 · Updated last month
- Train speculative decoding models effortlessly and port them smoothly to SGLang serving. ☆716 · Feb 28, 2026 · Updated last week
- A QT GUI for large language models ☆40 · Dec 27, 2023 · Updated 2 years ago
- Official PyTorch implementation of "GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance" (ICML 2025) ☆50 · Jul 6, 2025 · Updated 8 months ago
- Oak National Academy's AI Auto Eval tools provide LLM as a judge evaluation on lesson plans and resources ☆17 · Nov 4, 2025 · Updated 4 months ago
- Model and application for deepfake detection using a hybrid approach (spatial + frequency-based) ☆48 · Jan 8, 2026 · Updated 2 months ago
- Easily view and modify JSON datasets for large language models ☆87 · May 16, 2025 · Updated 9 months ago
- Run Orpheus 3B Locally With LM Studio ☆32 · Mar 20, 2025 · Updated 11 months ago