dmatora / LLM-inference-speed-benchmarks
☆20 · Updated 11 months ago
Alternatives and similar repositories for LLM-inference-speed-benchmarks
Users who are interested in LLM-inference-speed-benchmarks are comparing it to the repositories listed below
- Trying to deconstruct RWKV in understandable terms ☆14 · Updated 2 years ago
- run ollama & gguf easily with a single command ☆52 · Updated last year
- ☆24 · Updated 8 months ago
- Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe… ☆82 · Updated 2 weeks ago
- ☆62 · Updated 2 months ago
- Modified Beam Search with periodical restart ☆12 · Updated last year
- Yet Another (LLM) Web UI, made with Gemini ☆12 · Updated 9 months ago
- Yet another frontend for LLM, written using .NET and WinUI 3 ☆10 · Updated 2 weeks ago
- OpenPipe Reinforcement Learning Experiments ☆31 · Updated 6 months ago
- Senna is an advanced AI-powered search engine designed to provide users with immediate answers to their queries by leveraging natural lan… ☆19 · Updated last year
- Local LLM inference & management server with built-in OpenAI API ☆31 · Updated last year
- Thematic Generalization Benchmark: measures how effectively various LLMs can infer a narrow or specific "theme" (category/rule) from a sm… ☆63 · Updated last week
- The heart of The Pulsar App: fast, secure and shared inference with modern UI ☆57 · Updated 9 months ago
- AirLLM 70B inference with single 4GB GPU ☆14 · Updated 3 months ago
- Chat WebUI is an easy-to-use user interface for interacting with AI, and it comes with multiple useful built-in tools such as web search … ☆45 · Updated 3 weeks ago
- ☆51 · Updated last year
- A sleek, customizable interface for managing LLMs with responsive design and easy agent personalization. ☆16 · Updated last year
- Various LLM Benchmarks ☆24 · Updated last month
- Easy-to-use, high-performance knowledge distillation for LLMs ☆93 · Updated 4 months ago
- LLM Divergent Thinking Creativity Benchmark. LLMs generate 25 unique words that start with a given letter with no connections to each oth… ☆34 · Updated 6 months ago
- A Windows tool to query various LLM AIs. Supports branched conversations, history and summaries among others. ☆33 · Updated 2 weeks ago
- 33B Chinese LLM, DPO QLORA, 100K context, AirLLM 70B inference with single 4GB GPU ☆13 · Updated last year
- GGUF Quantization of any LLM. ☆40 · Updated last year
- ☆22 · Updated last month
- Attend - to what matters. ☆17 · Updated 7 months ago
- Who needs o1 anyways. Add CoT to any OpenAI compatible endpoint. ☆44 · Updated last year
- LLM based agents with proactive interactions, long-term memory, external tool integration, and local deployment capabilities. ☆106 · Updated 2 months ago
- Tcurtsni: Reverse Instruction Chat, ever wonder what your LLM wants to ask you? ☆23 · Updated last year
- Cortex.Tensorrt-LLM is a C++ inference library that can be loaded by any server at runtime. It submodules NVIDIA’s TensorRT-LLM for GPU a… ☆42 · Updated last year
- Simple, fast, parallel Hugging Face GGML model downloader written in Python ☆24 · Updated 2 years ago