menloresearch / cortex.llamacpp
cortex.llamacpp is a high-efficiency C++ inference engine for edge computing. It is a dynamic library that can be loaded by any server at runtime.
☆38 · Updated this week
Alternatives and similar repositories for cortex.llamacpp:
Users who are interested in cortex.llamacpp are comparing it to the libraries listed below.
- A chat UI for Llama.cpp ☆13 · Updated 2 weeks ago
- Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It submodules NVIDIA’s TensorRT-LLM for GPU a… ☆43 · Updated 7 months ago
- Yet Another (LLM) Web UI, made with Gemini ☆11 · Updated 4 months ago
- Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe… ☆56 · Updated 2 months ago
- A simple library for working with Hugging Face models. ☆14 · Updated 3 months ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆33 · Updated 9 months ago
- Experiments with BitNet inference on CPU ☆53 · Updated last year
- ☆31 · Updated last year
- Something similar to Apple Intelligence? ☆60 · Updated 9 months ago
- llama.cpp fork used by GPT4All ☆55 · Updated 2 months ago
- Easily convert HuggingFace models to GGUF-format for llama.cpp ☆21 · Updated 8 months ago
- idea: https://github.com/nyxkrage/ebook-groupchat/ ☆86 · Updated 8 months ago
- A JavaScript library (with TypeScript types) to parse metadata of GGML-based GGUF files. ☆47 · Updated 8 months ago
- Local LLM inference & management server with built-in OpenAI API ☆31 · Updated last year
- ☆24 · Updated 3 months ago
- ☆22 · Updated this week
- GGML implementation of BERT model with Python bindings and quantization. ☆56 · Updated last year
- Course Project for COMP4471 on RWKV ☆17 · Updated last year
- A sleek, customizable interface for managing LLMs with responsive design and easy agent personalization. ☆16 · Updated 7 months ago
- Running Microsoft's BitNet via Electron, React & Astro ☆30 · Updated last week
- ☆100 · Updated 7 months ago
- ☆19 · Updated 11 months ago
- ggml implementation of embedding models including SentenceTransformer and BGE ☆56 · Updated last year
- Spotlight-like client for Ollama on Windows. ☆27 · Updated 11 months ago
- ☆17 · Updated 2 months ago
- An open-source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO ☆29 · Updated this week
- ☆54 · Updated 8 months ago
- The heart of The Pulsar App: fast, secure, and shared inference with a modern UI ☆56 · Updated 4 months ago
- ☆19 · Updated last month
- Port of Suno AI's Bark in C/C++ for fast inference ☆52 · Updated last year