TabbyML / registry-tabby
☆34Updated last month
Alternatives and similar repositories for registry-tabby
Users that are interested in registry-tabby are comparing it to the libraries listed below
- Self-hosted LLM chatbot arena, with yourself as the only judge☆41Updated last year
- ☆90Updated last month
- The heart of The Pulsar App, fast, secure and shared inference with modern UI☆56Updated 6 months ago
- 🤖 AI-powered CLI for file reorganization. Runs fully locally — no data leaves your machine.☆16Updated 2 months ago
- GGML implementation of BERT model with Python bindings and quantization.☆55Updated last year
- LM inference server implementation based on *.cpp.☆226Updated this week
- Review/Check GGUF files and estimate the memory usage and maximum tokens per second.☆177Updated last week
- Use safetensors with ONNX 🤗☆63Updated 3 months ago
- LLM training in simple, raw C/CUDA, migrated into Rust☆46Updated 3 months ago
- fast state-of-the-art speech models and a runtime that runs anywhere 💥☆55Updated 2 weeks ago
- Virtual environment stacks for Python☆258Updated last week
- LLM powered development for IntelliJ☆81Updated last year
- Inference of Large Multimodal Models in C/C++. LLaVA and others☆47Updated last year
- convert a saved pytorch model to gguf and generate as much corresponding ggml c code as possible☆14Updated last year
- Blazingly fast inference of diffusion models.☆108Updated 2 months ago
- "Pacha" TUI (Text User Interface) is a JavaScript application that utilizes the "blessed" library. It serves as a frontend for llama.cpp …☆36Updated last year
- Simple, Fast, Parallel Huggingface GGML model downloader written in python☆24Updated last year
- VSCode AI coding assistant powered by self-hosted llama.cpp endpoint.☆183Updated 4 months ago
- A CLI for piping outputs to ollama or just prompting☆54Updated 11 months ago
- A pure Rust-based LLM (any LLM-based MLLM such as Spark-TTS) inference engine, powered by the Candle framework.☆126Updated last week
- A game of pong made by MetaGPT and ChatGPT Code Interpreter☆14Updated last year
- Forces DeepSeek R1 models to engage in extended reasoning by intercepting early termination tokens.☆19Updated 4 months ago
- General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). …☆49Updated 4 months ago
- Transformer GPU VRAM estimator☆65Updated last year
- Nexusflow function call, tool use, and agent benchmarks.☆20Updated 6 months ago
- 🎮 Material You TUI for monitoring NVIDIA GPUs☆50Updated 3 weeks ago
- Visual Studio Code extension for WizardCoder☆148Updated last year
- Display images in the terminal☆17Updated last year
- The AI agent script CLI for Programmable Prompt Engine.☆59Updated 2 months ago
- An OpenAI API compatible LLM inference server based on ExLlamaV2.☆25Updated last year