TabbyML / registry-tabby
☆31 · Updated 3 months ago
Alternatives and similar repositories for registry-tabby
Users interested in registry-tabby are comparing it to the libraries listed below.
- LLM powered development for IntelliJ ☆80 · Updated last year
- Self-hosted LLM chatbot arena, with yourself as the only judge ☆40 · Updated last year
- llama.cpp fork used by GPT4All ☆55 · Updated 2 months ago
- AirLLM 70B inference with single 4GB GPU ☆12 · Updated 9 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs (Windows build & kernels) ☆37 · Updated last week
- Refact AI: Open-source AI code assistant with autocompletion, chat, refactoring and more for IntelliJ JetBrains IDEs ☆60 · Updated this week
- Web tool to count LLM tokens (GPT, Claude, Llama, ...) ☆31 · Updated last week
- Thin wrapper around GGML to make life easier ☆29 · Updated this week
- Simple, fast, parallel Hugging Face GGML model downloader written in Python ☆24 · Updated last year
- Use safetensors with ONNX 🤗 ☆57 · Updated 2 months ago
- Inference of Large Multimodal Models in C/C++. LLaVA and others ☆46 · Updated last year
- GGML implementation of BERT model with Python bindings and quantization. ☆56 · Updated last year
- The heart of The Pulsar App: fast, secure and shared inference with a modern UI ☆56 · Updated 5 months ago
- Copilot X-like features for JetBrains IDEs using ChatGPT and GPT-4 ☆40 · Updated last year
- Rust executable for Refact Agent; it lives inside your IDE and keeps AST and VecDB indexes up to date, offers agentic tools for an AI mod… ☆59 · Updated 2 months ago
- TensorRT-LLM server with Structured Outputs (JSON) built with Rust ☆52 · Updated 3 weeks ago
- Testing LLM reasoning abilities with family relationship quizzes. ☆62 · Updated 3 months ago
- Forces DeepSeek R1 models to engage in extended reasoning by intercepting early termination tokens. ☆19 · Updated 3 months ago
- Various LLM benchmarks ☆19 · Updated last week
- Nexusflow function call, tool use, and agent benchmarks. ☆19 · Updated 5 months ago
- GGML implementation of BERT model with Python bindings and quantization. ☆26 · Updated last year
- Running Microsoft's BitNet via Electron, React & Astro ☆38 · Updated 3 weeks ago
- Source code for Intel's Polite Guard NLP project ☆33 · Updated this week
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆138 · Updated last week
- An endpoint server for efficiently serving quantized open-source LLMs for code. ☆55 · Updated last year
- CI for ggml and related projects ☆29 · Updated this week
- Fast state-of-the-art speech models and a runtime that runs anywhere 💥 ☆55 · Updated 3 months ago
- Rust standalone inference of Namo-500M series models. Extremely tiny, running VLM on CPU. ☆24 · Updated 2 months ago
- A proxy that hosts multiple single-model runners such as llama.cpp and vLLM ☆12 · Updated last month
- cortex.llamacpp is a high-efficiency C++ inference engine for edge computing. It is a dynamic library that can be loaded by any server a… ☆40 · Updated this week