janhq / cortex.llamacpp
cortex.llamacpp is a high-efficiency C++ inference engine for edge computing. It is a dynamic library that can be loaded by any server at runtime.
☆34 · Updated this week
Alternatives and similar repositories for cortex.llamacpp:
Users who are interested in cortex.llamacpp are comparing it to the libraries listed below.
- Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It submodules NVIDIA’s TensorRT-LLM for GPU a… ☆42 · Updated 4 months ago
- A chat UI for Llama.cpp ☆12 · Updated this week
- Easy to use, High Performant Knowledge Distillation for LLMs ☆40 · Updated 3 weeks ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆31 · Updated 6 months ago
- G2P ☆35 · Updated this week
- Yet Another (LLM) Web UI, made with Gemini ☆11 · Updated last month
- Convert your PDFs into audiobooks effortlessly. Features intelligent text extraction, customizable text-to-speech settings, and efficient… ☆37 · Updated 2 weeks ago
- Port of Suno AI's Bark in C/C++ for fast inference ☆54 · Updated 9 months ago
- stable-diffusion.cpp bindings for python ☆34 · Updated last week
- Course Project for COMP4471 on RWKV ☆17 · Updated 11 months ago
- Large Model Proxy is designed to make it easy to run multiple resource-heavy Large Models (LM) on the same machine with limited amount of… ☆49 · Updated 3 months ago
- GGML implementation of BERT model with Python bindings and quantization. ☆53 · Updated 11 months ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆17 · Updated 3 months ago
- Experiments with BitNet inference on CPU ☆52 · Updated 9 months ago
- General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). … ☆44 · Updated 4 months ago
- AirLLM 70B inference with single 4GB GPU ☆12 · Updated 5 months ago
- ☆21 · Updated 5 months ago
- Easily convert HuggingFace models to GGUF-format for llama.cpp ☆21 · Updated 6 months ago
- This is the Mixture-of-Agents (MoA) concept, adapted from the original work by TogetherAI. My version is tailored for local model usage a… ☆108 · Updated 7 months ago
- Nomic Vulkan Fork of LLaMa.cpp ☆51 · Updated this week
- ☆31 · Updated last year
- entropix style sampling + GUI ☆25 · Updated 3 months ago
- Video+code lecture on building nanoGPT from scratch ☆65 · Updated 7 months ago
- After my server ui improvements were successfully merged, consider this repo a playground for experimenting, tinkering and hacking around… ☆56 · Updated 5 months ago
- ☆109 · Updated last month
- B-Llama3o a llama3 with Vision Audio and Audio understanding as well as text and Audio and Animation Data output. ☆26 · Updated 7 months ago
- SPLAA is an AI assistant framework that utilizes voice recognition, text-to-speech, and tool-calling capabilities to provide a conversati… ☆23 · Updated 3 weeks ago
- llama.cpp fork with additional SOTA quants and improved performance ☆133 · Updated this week
- Locally running LLM with internet access ☆93 · Updated 3 months ago
- A Model Context Protocol server for searching and analyzing arXiv papers ☆52 · Updated this week