dmatora / LLM-inference-speed-benchmarks
☆18Updated 7 months ago
Alternatives and similar repositories for LLM-inference-speed-benchmarks
Users that are interested in LLM-inference-speed-benchmarks are comparing it to the libraries listed below
- Yet Another (LLM) Web UI, made with Gemini☆12Updated 4 months ago
- Trying to deconstruct RWKV in understandable terms☆14Updated 2 years ago
- Attend - to what matters.☆15Updated 2 months ago
- An open source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO☆29Updated this week
- Modified Beam Search with periodical restart☆12Updated 8 months ago
- A combination of Oobabooga's fork and the main cuda branch of GPTQ-for-LLaMa in a package format.☆22Updated last year
- ☆27Updated 8 months ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre…☆19Updated 7 months ago
- ☆24Updated 3 months ago
- Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI …☆49Updated 3 months ago
- ☆19Updated last month
- Build HTML artefacts with Ollama☆11Updated 5 months ago
- ☆16Updated last year
- ☆11Updated 3 months ago
- AirLLM 70B inference with single 4GB GPU☆12Updated 9 months ago
- Simple LLM inference server☆20Updated 11 months ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre…☆34Updated 10 months ago
- A Windows tool to query various LLM AIs. Supports branched conversations, history and summaries among others.☆30Updated this week
- Thematic Generalization Benchmark: measures how effectively various LLMs can infer a narrow or specific "theme" (category/rule) from a sm…☆51Updated last week
- ☆9Updated 11 months ago
- Yet another frontend for LLM, written using .NET and WinUI 3☆10Updated 5 months ago
- Local LLM inference & management server with built-in OpenAI API☆31Updated last year
- A QT GUI for large language models☆34Updated last year
- The official Python library for Formulaic☆16Updated last year
- Adding a multi-text multi-speaker script (diffe) that is based on a script from asiff00 on issue 61 for Sesame: A Conversational Speech G…☆23Updated last month
- Simple, Fast, Parallel Huggingface GGML model downloader written in python☆24Updated last year
- 5X faster 60% less memory QLoRA finetuning☆21Updated 11 months ago
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min…☆26Updated 6 months ago
- LLM backed Fantasy Tribe Game☆18Updated 5 months ago
- MilimoChat: Privacy-first, self-hosted AI chat with customizable personas, context-aware memory, and local analytics. Built on Python/Str…☆12Updated 2 months ago