menloresearch / cortex.llamacpp
cortex.llamacpp is a high-efficiency C++ inference engine for edge computing. It is a dynamic library that can be loaded by any server at runtime.
☆40 Updated this week
Alternatives and similar repositories for cortex.llamacpp
Users who are interested in cortex.llamacpp compare it to the libraries listed below.
- Yet Another (LLM) Web UI, made with Gemini ☆12 Updated 5 months ago
- Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It submodules NVIDIA’s TensorRT-LLM for GPU a… ☆43 Updated 8 months ago
- Thin wrapper around GGML to make life easier ☆34 Updated this week
- ☆21 Updated 4 months ago
- ☆24 Updated 4 months ago
- A chat UI for Llama.cpp ☆13 Updated this week
- GGML implementation of BERT model with Python bindings and quantization. ☆55 Updated last year
- ☆29 Updated last month
- Local LLM inference & management server with built-in OpenAI API ☆31 Updated last year
- AirLLM 70B inference with single 4GB GPU ☆13 Updated 9 months ago
- A sleek, customizable interface for managing LLMs with responsive design and easy agent personalization. ☆15 Updated 9 months ago
- Running Microsoft's BitNet via Electron, React & Astro ☆39 Updated this week
- The heart of The Pulsar App: fast, secure and shared inference with modern UI ☆56 Updated 6 months ago
- ☆31 Updated last year
- Port of Suno AI's Bark in C/C++ for fast inference ☆52 Updated last year
- deep hermes, but decides how to respond based on its OWN decision, no need for system prompts. ☆36 Updated 2 months ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆34 Updated 10 months ago
- This public GitHub repository contains code for a fully self-hosted, on-premise transcription solution. ☆52 Updated 5 months ago
- ☆58 Updated 5 months ago
- Super simple python connectors for llama.cpp, including vision models (Gemma 3, Qwen2-VL). Compile llama.cpp and run! ☆24 Updated last month
- llama.cpp fork used by GPT4All ☆55 Updated 3 months ago
- Babylon.cpp is a C and C++ library for grapheme to phoneme conversion and text to speech synthesis. For phonemization a ONNX runtime port… ☆21 Updated 9 months ago
- Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe… ☆65 Updated this week
- JotItNow is an AI Voice Notes App ☆15 Updated 3 months ago
- ggml implementation of BERT Embedding ☆25 Updated last year
- Controllable Language Model Interactions in TypeScript ☆9 Updated last year
- LLM inference in C/C++ ☆21 Updated 2 months ago
- Ask shortgpt for instant and concise answers ☆13 Updated 2 years ago
- An interface that features almost zero external dependencies beyond the Ollama API itself, making it lightweight and portable to easily i… ☆12 Updated 2 months ago
- CI for ggml and related projects ☆29 Updated this week