dmatora / LLM-inference-speed-benchmarks
☆19Updated 10 months ago
Alternatives and similar repositories for LLM-inference-speed-benchmarks
Users who are interested in LLM-inference-speed-benchmarks are comparing it to the repositories listed below.
- AirLLM 70B inference with single 4GB GPU☆14Updated last month
- run ollama & gguf easily with a single command☆52Updated last year
- 33B Chinese LLM, DPO QLORA, 100K context, AirLLM 70B inference with single 4GB GPU☆13Updated last year
- Trying to deconstruct RWKV in understandable terms☆14Updated 2 years ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre…☆36Updated last year
- LLM Divergent Thinking Creativity Benchmark. LLMs generate 25 unique words that start with a given letter with no connections to each oth…☆33Updated 4 months ago
- ☆24Updated 6 months ago
- Attend - to what matters.☆17Updated 5 months ago
- V.I.S.O.R., my in-development AI-powered voice assistant with integrated memory!☆38Updated last week
- The heart of The Pulsar App: fast, secure, and shared inference with a modern UI☆55Updated 8 months ago
- Chat WebUI is an easy-to-use user interface for interacting with AI, and it comes with multiple useful built-in tools.☆32Updated last month
- Local LLM inference & management server with built-in OpenAI API☆31Updated last year
- Modified Beam Search with periodical restart☆12Updated 11 months ago
- Loader extension for tabbyAPI in SillyTavern☆27Updated last month
- PowerShell automation to rebuild llama.cpp for a Windows environment.☆32Updated this week
- Accepts a Hugging Face model URL, automatically downloads and quantizes it using Bits and Bytes.☆38Updated last year
- Experimental sampler to make LLMs more creative☆31Updated 2 years ago
- A simple GUI utility for gathering LIMA-like chat data.☆23Updated 5 months ago
- Senna is an advanced AI-powered search engine designed to provide users with immediate answers to their queries by leveraging natural lan…☆19Updated 11 months ago
- Who needs o1 anyways. Add CoT to any OpenAI compatible endpoint.☆43Updated 10 months ago
- Yet Another (LLM) Web UI, made with Gemini☆12Updated 7 months ago
- Tcurtsni: Reverse Instruction Chat, ever wonder what your LLM wants to ask you?☆22Updated last year
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min…☆26Updated 9 months ago
- convert a saved pytorch model to gguf and generate as much corresponding ggml c code as possible☆15Updated last year
- OpenPipe Reinforcement Learning Experiments☆30Updated 4 months ago
- RetroChat is a powerful command-line interface for interacting with various AI language models. It provides a seamless experience for eng…☆77Updated 3 weeks ago
- Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI …☆51Updated 6 months ago
- "Pacha" TUI (Text User Interface) is a JavaScript application that utilizes the "blessed" library. It serves as a frontend for llama.cpp …☆36Updated 2 years ago
- Note about running ollama 🦙☆35Updated last year
- A Windows tool to query various LLM AIs. Supports branched conversations, history and summaries among others.☆33Updated last week