Codys12 / airllm
AirLLM 70B inference with single 4GB GPU
☆12 Updated 7 months ago
Alternatives and similar repositories for airllm:
Users who are interested in airllm are comparing it to the libraries listed below:
- ☆24 Updated 2 months ago
- Yet Another (LLM) Web UI, made with Gemini ☆11 Updated 3 months ago
- Chat WebUI is an easy-to-use user interface for interacting with AI, and it comes with multiple useful built-in tools. ☆21 Updated 3 weeks ago
- Local LLM inference & management server with built-in OpenAI API ☆31 Updated 11 months ago
- run ollama & gguf easily with a single command ☆49 Updated 10 months ago
- Controllable Language Model Interactions in TypeScript ☆9 Updated 10 months ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆32 Updated 8 months ago
- A sleek, customizable interface for managing LLMs with responsive design and easy agent personalization. ☆15 Updated 6 months ago
- Proteus is an experimental platform that combines the power of Large Language Models with the Genesis physics engine ☆21 Updated 3 months ago
- The hearth of The Pulsar App, fast, secure and shared inference with modern UI ☆56 Updated 3 months ago
- Create text chunks which end at natural stopping points without using a tokenizer ☆26 Updated last week
- ☆27 Updated 6 months ago
- an auto-sleeping and -waking framework around llama.cpp ☆11 Updated last month
- Editor with LLM generation tree exploration ☆65 Updated last month
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch ☆86 Updated 3 months ago
- Built for demanding AI workflows, this gateway offers low-latency, provider-agnostic access, ensuring your AI applications run smoothly a… ☆47 Updated 2 weeks ago
- ☆15 Updated this week
- Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI … ☆49 Updated last month
- convert a saved pytorch model to gguf and generate as much corresponding ggml c code as possible ☆14 Updated last year
- Modified Beam Search with periodical restart ☆12 Updated 6 months ago
- GPT-4 Level Conversational QA Trained In a Few Hours ☆59 Updated 7 months ago
- Who needs o1 anyways. Add CoT to any OpenAI compatible endpoint. ☆41 Updated 6 months ago
- An OpenAI API compatible LLM inference server based on ExLlamaV2. ☆25 Updated last year
- Yet another frontend for LLM, written using .NET and WinUI 3 ☆10 Updated 4 months ago
- Tcurtsni: Reverse Instruction Chat, ever wonder what your LLM wants to ask you? ☆21 Updated 9 months ago
- A python package for serving LLM on OpenAI-compatible API endpoints with prompt caching using MLX. ☆73 Updated 3 months ago
- Attend - to what matters. ☆14 Updated last month
- fast state-of-the-art speech models and a runtime that runs anywhere 💥 ☆55 Updated last month
- Serving LLMs in the HF-Transformers format via a PyFlask API ☆71 Updated 6 months ago
- MilimoChat: Privacy-first, self-hosted AI chat with customizable personas, context-aware memory, and local analytics. Built on Python/Str… ☆11 Updated last week