dmatora / LLM-inference-speed-benchmarks
☆17 Updated 5 months ago
Alternatives and similar repositories for LLM-inference-speed-benchmarks:
Users who are interested in LLM-inference-speed-benchmarks are comparing it to the repositories listed below.
- Yet Another (LLM) Web UI, made with Gemini ☆11 Updated 3 months ago
- Attend - to what matters. ☆14 Updated last month
- AirLLM 70B inference with single 4GB GPU ☆12 Updated 7 months ago
- The official Python library for Formulaic ☆16 Updated 11 months ago
- Chat WebUI is an easy-to-use user interface for interacting with AI, and it comes with multiple useful built-in tools. ☆21 Updated last month
- Modified Beam Search with periodical restart ☆12 Updated 6 months ago
- ☆27 Updated 7 months ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre… ☆32 Updated 8 months ago
- ☆24 Updated 2 months ago
- run ollama & gguf easily with a single command ☆50 Updated 10 months ago
- Simple, Fast, Parallel Huggingface GGML model downloader written in Python ☆24 Updated last year
- 33B Chinese LLM, DPO QLORA, 100K context, AirLLM 70B inference with single 4GB GPU ☆13 Updated 10 months ago
- Uses a Gradio interface to stream coding related responses from local and cloud based large language models. Pulls context from GitHub Re… ☆20 Updated 2 weeks ago
- Local LLM inference & management server with built-in OpenAI API ☆31 Updated 11 months ago
- OpenPipe Reinforcement Learning Experiments ☆21 Updated 2 weeks ago
- ☆17 Updated 2 months ago
- Build HTML artefacts with Ollama ☆11 Updated 3 months ago
- MilimoChat: Privacy-first, self-hosted AI chat with customizable personas, context-aware memory, and local analytics. Built on Python/Str… ☆11 Updated 2 weeks ago
- A proxy that hosts multiple single-model runners such as LLama.cpp and vLLM ☆12 Updated this week
- Trying to deconstruct RWKV in understandable terms ☆14 Updated last year
- LLM Chat is an open-source serverless alternative to ChatGPT. ☆33 Updated 6 months ago
- The heart of The Pulsar App, fast, secure and shared inference with modern UI ☆56 Updated 3 months ago
- Adding a multi-text multi-speaker script (diffe) that is based on a script from asiff00 on issue 61 for Sesame: A Conversational Speech G… ☆21 Updated this week
- V.I.S.O.R., my in-development AI-powered voice assistant with integrated memory! ☆35 Updated this week
- ☆22 Updated 7 months ago
- ☆17 Updated last week
- A Windows tool to query various LLM AIs. Supports branched conversations, history and summaries among others. ☆29 Updated this week
- ☆16 Updated last year
- A chat UI for Llama.cpp ☆12 Updated 2 weeks ago
- A combination of Oobabooga's fork and the main cuda branch of GPTQ-for-LLaMa in a package format. ☆22 Updated last year