janhq / cortex.llamacpp
cortex.llamacpp is a high-efficiency C++ inference engine for edge computing. It is a dynamic library that can be loaded by any server at runtime.
☆42Updated 7 months ago
Alternatives and similar repositories for cortex.llamacpp
Users who are interested in cortex.llamacpp are comparing it to the libraries listed below.
- Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It submodules NVIDIA’s TensorRT-LLM for GPU a…☆42Updated last year
- Lightweight C inference for Qwen3 GGUF. Multiturn prefix caching & batch processing.☆21Updated 5 months ago
- Running Microsoft's BitNet via Electron, React & Astro☆52Updated 4 months ago
- TTS support with GGML☆218Updated 4 months ago
- Controllable Language Model Interactions in TypeScript☆10Updated last year
- Thin wrapper around GGML to make life easier☆42Updated 3 months ago
- Locally running LLM with internet access☆97Updated 7 months ago
- GGML implementation of BERT model with Python bindings and quantization.☆58Updated last year
- instinct.cpp provides ready to use alternatives to OpenAI Assistant API and built-in utilities for developing AI Agent applications (RAG,…☆57Updated last year
- AirLLM 70B inference with single 4GB GPU☆17Updated 7 months ago
- Experiments with BitNet inference on CPU☆55Updated last year
- A JavaScript library (with TypeScript types) to parse metadata of GGML-based GGUF files.☆51Updated last year
- Port of Suno AI's Bark in C/C++ for fast inference☆54Updated last year
- Local11Labs allows generating high-quality text-to-speech and podcast content using the fast and tiny Kokoro-82M.☆49Updated last year
- The heart of The Pulsar App, fast, secure and shared inference with modern UI☆59Updated last year
- SPLAA is an AI assistant framework that utilizes voice recognition, text-to-speech, and tool-calling capabilities to provide a conversati…☆28Updated 9 months ago
- A sleek, customizable interface for managing LLMs with responsive design and easy agent personalization.☆17Updated last year
- Yet Another (LLM) Web UI, made with Gemini☆12Updated last year
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre…☆41Updated last year
- Inference of Large Multimodal Models in C/C++. LLaVA and others☆48Updated 2 years ago
- fast state-of-the-art speech models and a runtime that runs anywhere 💥☆57Updated 7 months ago
- A ggml (C++) re-implementation of tortoise-tts☆193Updated last year
- Yet another frontend for LLM, written using .NET and WinUI 3☆10Updated 4 months ago
- MilimoChat: Privacy-first, self-hosted AI chat with customizable personas, context-aware memory, and local analytics. Built on Python/Str…☆14Updated 10 months ago
- Cleanai (https://github.com/willmil11/cleanai) except I'm making it in C now. Fast and clean from the start this time :)☆17Updated last month
- Course Project for COMP4471 on RWKV☆17Updated last year
- Load and run Llama from safetensors files in C☆15Updated last year
- A chat UI for Llama.cpp☆15Updated 2 months ago