toverainc / ai-router
AI Router
☆13 Updated last year
Alternatives and similar repositories for ai-router
Users who are interested in ai-router are comparing it to the libraries listed below.
- GGML implementation of BERT model with Python bindings and quantization. ☆56 Updated last year
- LLaVA server (llama.cpp). ☆183 Updated 2 years ago
- Smart proxy for LLM APIs that enables model-specific parameter control, automatic mode switching (like Qwen3's /think and /no_think), and… ☆51 Updated 5 months ago
- Port of Suno AI's Bark in C/C++ for fast inference ☆52 Updated last year
- Easily convert HuggingFace models to GGUF-format for llama.cpp ☆23 Updated last year
- Trying to deconstruct RWKV in understandable terms ☆14 Updated 2 years ago
- Automated LLM Coding Tournaments. There can be only one (winning code solution from the competing AIs) ☆25 Updated 7 months ago
- ☆102 Updated last year
- GPU-targeted vendor-agnostic AI library for Windows, and Mistral model implementation. ☆58 Updated last year
- Official implementation of "WhisperNER: Unified Open Named Entity and Speech Recognition" ☆196 Updated 8 months ago
- Family of instruction-following LLMs powered by Evol-Instruct: WizardLM, WizardCoder ☆44 Updated last year
- Create 3D files in the CLI with Small Language Model ☆41 Updated 3 weeks ago
- Lightweight toolkit package to train and fine-tune 1.58bit Language models ☆97 Updated 5 months ago
- Port of Microsoft's BioGPT in C/C++ using ggml ☆85 Updated last year
- An unsupervised model merging algorithm for Transformers-based language models. ☆106 Updated last year
- GPT-4 Level Conversational QA Trained In a Few Hours ☆65 Updated last year
- A python package for serving LLM on OpenAI-compatible API endpoints with prompt caching using MLX. ☆98 Updated 4 months ago
- A fast RWKV Tokenizer written in Rust ☆54 Updated 2 months ago
- Pivotal Token Search ☆131 Updated 3 months ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆22 Updated last year
- LLM based agents with proactive interactions, long-term memory, external tool integration, and local deployment capabilities. ☆105 Updated 3 months ago
- ☆49 Updated 9 months ago
- LLM inference in C/C++ ☆103 Updated last week
- xllamacpp - a Python wrapper of llama.cpp ☆62 Updated last week
- Prepare for DeepSeek R1 inference: Benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code. ☆74 Updated 9 months ago
- Let's create synthetic textbooks together :) ☆75 Updated last year
- The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens. ☆13 Updated last year
- Preprint: Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning ☆28 Updated last year
- ☆31 Updated last year
- Demo python script app to interact with llama.cpp server using whisper API, microphone and webcam devices. ☆45 Updated 2 years ago