menloresearch / cortex.llamacpp
cortex.llamacpp is a high-efficiency C++ inference engine for edge computing. It is a dynamic library that can be loaded by any server at runtime.
☆42Updated 2 months ago
Alternatives and similar repositories for cortex.llamacpp
Users interested in cortex.llamacpp are comparing it to the libraries listed below.
- Lightweight C inference for Qwen3 GGUF. Multiturn prefix caching & batch processing.☆18Updated 2 weeks ago
- Yet Another (LLM) Web UI, made with Gemini☆12Updated 8 months ago
- instinct.cpp provides ready to use alternatives to OpenAI Assistant API and built-in utilities for developing AI Agent applications (RAG,…☆53Updated last year
- Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It submodules NVIDIA’s TensorRT-LLM for GPU a…☆42Updated 11 months ago
- AirLLM 70B inference with single 4GB GPU☆14Updated 2 months ago
- Thin wrapper around GGML to make life easier☆40Updated 2 months ago
- ☆24Updated 7 months ago
- Running Microsoft's BitNet via Electron, React & Astro☆44Updated 3 months ago
- The heart of The Pulsar App, fast, secure and shared inference with modern UI☆57Updated 9 months ago
- TTS support with GGML☆176Updated 3 weeks ago
- An OpenVoice-based voice cloning tool, single executable file (~14M), supporting multiple formats without dependencies on ffmpeg, Python,…☆33Updated 3 weeks ago
- Spotlight-like client for Ollama on Windows.☆28Updated last year
- Tcurtsni: Reverse Instruction Chat, ever wonder what your LLM wants to ask you?☆22Updated last year
- A sleek, customizable interface for managing LLMs with responsive design and easy agent personalization.☆16Updated last year
- Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe…☆82Updated this week
- ☆23Updated 7 months ago
- SPLAA is an AI assistant framework that utilizes voice recognition, text-to-speech, and tool-calling capabilities to provide a conversati…☆28Updated 4 months ago
- A chat UI for Llama.cpp☆15Updated last week
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre…☆39Updated last year
- Load and run Llama from safetensors files in C☆11Updated 10 months ago
- Something similar to Apple Intelligence?☆61Updated last year
- a lightweight, open-source blueprint for building powerful and scalable LLM chat applications☆28Updated last year
- Browser extension that lets you summarize and chat with any webpage using a local LLM of your choice.☆22Updated 10 months ago
- Yet another frontend for LLM, written using .NET and WinUI 3☆10Updated this week
- MilimoChat: Privacy-first, self-hosted AI chat with customizable personas, context-aware memory, and local analytics. Built on Python/Str…☆14Updated 6 months ago
- Locally running LLM with internet access☆96Updated 2 months ago
- Experiments with BitNet inference on CPU☆54Updated last year
- LLM Ripper is a framework for component extraction (embeddings, attention heads, FFNs), activation capture, functional analysis, and adap…☆44Updated this week
- Generate a llama-quantize command to copy the quantization parameters of any GGUF☆24Updated last month
- LlamaCards is a web application that provides a dynamic interface for interacting with LLM models in real-time. This app allows users to …☆39Updated last year