menloresearch / cortex.llamacpp
cortex.llamacpp is a high-efficiency C++ inference engine for edge computing. It is a dynamic library that can be loaded by any server at runtime.
☆42 · Updated last week
Alternatives and similar repositories for cortex.llamacpp
Users who are interested in cortex.llamacpp are comparing it to the libraries listed below.
- Yet Another (LLM) Web UI, made with Gemini ☆12 · Updated 6 months ago
- Thin wrapper around GGML to make life easier ☆36 · Updated 3 weeks ago
- A chat UI for Llama.cpp ☆15 · Updated last week
- Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It submodules NVIDIA’s TensorRT-LLM for GPU a… ☆43 · Updated 9 months ago
- instinct.cpp provides ready to use alternatives to OpenAI Assistant API and built-in utilities for developing AI Agent applications (RAG,… ☆51 · Updated last year
- GGML implementation of BERT model with Python bindings and quantization. ☆55 · Updated last year
- Experiments with BitNet inference on CPU ☆54 · Updated last year
- Local LLM inference & management server with built-in OpenAI API ☆31 · Updated last year
- ☆21 · Updated 5 months ago
- A Javascript library (with Typescript types) to parse metadata of GGML based GGUF files. ☆47 · Updated 11 months ago
- A ggml (C++) re-implementation of tortoise-tts ☆188 · Updated 10 months ago
- TTS support with GGML ☆127 · Updated 2 weeks ago
- a lightweight, open-source blueprint for building powerful and scalable LLM chat applications ☆28 · Updated last year
- Course Project for COMP4471 on RWKV ☆17 · Updated last year
- ☆31 · Updated last year
- Port of Suno AI's Bark in C/C++ for fast inference ☆52 · Updated last year
- AirLLM 70B inference with single 4GB GPU ☆14 · Updated 2 weeks ago
- Running Microsoft's BitNet via Electron, React & Astro ☆43 · Updated last month
- ☆146 · Updated last year
- Simple agent framework using Ollama tool calling ☆10 · Updated 10 months ago
- RetroChat is a powerful command-line interface for interacting with various AI language models. It provides a seamless experience for eng… ☆76 · Updated last month
- LlamaCards is a web application that provides a dynamic interface for interacting with LLM models in real-time. This app allows users to … ☆39 · Updated 10 months ago
- Train your own small bitnet model ☆74 · Updated 8 months ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆36 · Updated 11 months ago
- Locally running LLM with internet access ☆95 · Updated 2 weeks ago
- ☆24 · Updated 5 months ago
- fast state-of-the-art speech models and a runtime that runs anywhere 💥 ☆55 · Updated last month
- Super simple python connectors for llama.cpp, including vision models (Gemma 3, Qwen2-VL). Compile llama.cpp and run! ☆25 · Updated 2 months ago
- ☆17 · Updated last week
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆20 · Updated 9 months ago