janhq / cortex.llamacpp
cortex.llamacpp is a high-efficiency C++ inference engine for edge computing. It is a dynamic library that can be loaded by any server at runtime.
☆41 · Updated 5 months ago
Alternatives and similar repositories for cortex.llamacpp
Users interested in cortex.llamacpp are comparing it to the libraries listed below.
- Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It submodules NVIDIA’s TensorRT-LLM for GPU a… ☆42 · Updated last year
- ☆26 · Updated 10 months ago
- Port of Suno AI's Bark in C/C++ for fast inference ☆53 · Updated last year
- AirLLM 70B inference with single 4GB GPU ☆14 · Updated 5 months ago
- instinct.cpp provides ready to use alternatives to OpenAI Assistant API and built-in utilities for developing AI Agent applications (RAG,… ☆54 · Updated last year
- TTS support with GGML ☆201 · Updated 2 months ago
- Experiments with BitNet inference on CPU ☆54 · Updated last year
- Yet Another (LLM) Web UI, made with Gemini ☆12 · Updated 11 months ago
- Lightweight C inference for Qwen3 GGUF. Multiturn prefix caching & batch processing. ☆19 · Updated 3 months ago
- Running Microsoft's BitNet via Electron, React & Astro ☆48 · Updated 2 months ago
- Babylon.cpp is a C and C++ library for grapheme to phoneme conversion and text to speech synthesis. For phonemization a ONNX runtime port… ☆27 · Updated 3 months ago
- ☆108 · Updated 4 months ago
- On-device streaming text-to-speech engine powered by deep learning ☆121 · Updated last week
- Thin wrapper around GGML to make life easier ☆40 · Updated last month
- GGML implementation of BERT model with Python bindings and quantization. ☆58 · Updated last year
- Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe… ☆85 · Updated this week
- Inference of Large Multimodal Models in C/C++. LLaVA and others ☆48 · Updated 2 years ago
- The heart of The Pulsar App, fast, secure and shared inference with modern UI ☆59 · Updated last year
- Train your own small bitnet model ☆75 · Updated last year
- A ggml (C++) re-implementation of tortoise-tts ☆193 · Updated last year
- Locally running LLM with internet access ☆97 · Updated 5 months ago
- llama.cpp fork used by GPT4All ☆55 · Updated 10 months ago
- unsloth-5090-multiple ☆60 · Updated 7 months ago
- SPLAA is an AI assistant framework that utilizes voice recognition, text-to-speech, and tool-calling capabilities to provide a conversati… ☆28 · Updated 7 months ago
- A chat UI for Llama.cpp ☆15 · Updated 2 weeks ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆41 · Updated last year
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆23 · Updated last year
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch ☆103 · Updated 11 months ago
- Port of Microsoft's BioGPT in C/C++ using ggml ☆85 · Updated last year
- Fork of llama.cpp, extended for GPT-NeoX, RWKV-v4, and Falcon models ☆28 · Updated 2 years ago