janhq / cortex.llamacpp
cortex.llamacpp is a high-efficiency C++ inference engine for edge computing. It is a dynamic library that can be loaded by any server at runtime.
☆36 Updated this week
Alternatives and similar repositories for cortex.llamacpp:
Users who are interested in cortex.llamacpp are comparing it to the libraries listed below.
- Yet Another (LLM) Web UI, made with Gemini ☆11 Updated 2 months ago
- Experiments with BitNet inference on CPU ☆53 Updated 11 months ago
- A chat UI for Llama.cpp ☆12 Updated 3 weeks ago
- ☆31 Updated last year
- GGML implementation of BERT model with Python bindings and quantization. ☆54 Updated last year
- Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It submodules NVIDIA’s TensorRT-LLM for GPU a… ☆42 Updated 5 months ago
- Port of Suno AI's Bark in C/C++ for fast inference ☆53 Updated 10 months ago
- AirLLM 70B inference with single 4GB GPU ☆12 Updated 7 months ago
- After my server ui improvements were successfully merged, consider this repo a playground for experimenting, tinkering and hacking around… ☆56 Updated 6 months ago
- TTS support with GGML ☆19 Updated 2 weeks ago
- Train your own small bitnet model ☆65 Updated 4 months ago
- The heart of The Pulsar App: fast, secure, and shared inference with a modern UI ☆56 Updated 3 months ago
- A JavaScript library (with TypeScript types) to parse metadata of GGML-based GGUF files. ☆46 Updated 7 months ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆31 Updated 7 months ago
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆123 Updated this week
- llama.cpp fork used by GPT4All ☆52 Updated 2 weeks ago
- Run Ollama & GGUF easily with a single command ☆49 Updated 9 months ago
- ☆30 Updated 9 months ago
- Accepts a Hugging Face model URL, automatically downloads and quantizes it using Bits and Bytes. ☆38 Updated 11 months ago
- Testing LLM reasoning abilities with family relationship quizzes. ☆60 Updated last month
- ggml implementation of BERT Embedding ☆25 Updated last year
- Course Project for COMP4471 on RWKV ☆17 Updated last year
- Easily convert Hugging Face models to GGUF format for llama.cpp ☆21 Updated 7 months ago
- Simple LLM inference server ☆20 Updated 8 months ago
- Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe… ☆53 Updated 2 weeks ago
- entropix style sampling + GUI ☆25 Updated 4 months ago
- Convert your PDFs into audiobooks effortlessly. Features intelligent text extraction, customizable text-to-speech settings, and efficient… ☆49 Updated this week