dmatora / LLM-inference-speed-benchmarks
☆14Updated last month
Related projects
Alternatives and complementary repositories for LLM-inference-speed-benchmarks
- Integrates AI tools into Microsoft® Word® (independently developed, not affiliated with Microsoft)☆36Updated last month
- The heart of The Pulsar App: fast, secure, and shared inference with a modern UI☆35Updated 3 weeks ago
- ☆20Updated 2 months ago
- A QT GUI for large language models☆24Updated 10 months ago
- AirLLM 70B inference with single 4GB GPU☆12Updated 3 months ago
- ☆18Updated 3 weeks ago
- Chat WebUI is an easy-to-use user interface for interacting with AI, and it comes with multiple useful built-in tools.☆14Updated last week
- Experimental sampler to make LLMs more creative☆30Updated last year
- An open-source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO☆22Updated this week
- Local LLM inference & management server with built-in OpenAI API☆31Updated 7 months ago
- ☆25Updated 2 months ago
- A Windows tool to query various LLM AIs. Supports branched conversations, history and summaries among others.☆28Updated 2 weeks ago
- ☆21Updated 3 months ago
- Simple LLM inference server☆18Updated 5 months ago
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min…☆23Updated last week
- Tcurtsni: Reverse Instruction Chat, ever wonder what your LLM wants to ask you?☆21Updated 4 months ago
- Large-Language-Model to Machine Interface project.☆17Updated 11 months ago
- V.I.S.O.R., my in-development AI-powered voice assistant with integrated memory!☆28Updated last week
- Training hybrid models for dummies.☆15Updated 3 weeks ago
- Modified Beam Search with periodical restart☆12Updated 2 months ago
- LangChain + LiteLLM that works☆25Updated 3 weeks ago
- A quick and optimized solution to manage llama-based GGUF quantized models, download GGUF files, retrieve message formatting, add more mo…☆12Updated 10 months ago
- ☆27Updated last year
- Python module that creates a context map for AI code generation☆14Updated 3 months ago
- idea: https://github.com/nyxkrage/ebook-groupchat/☆82Updated 3 months ago
- 5X faster 60% less memory QLoRA finetuning☆21Updated 5 months ago
- 4 million public stable diffusion prompts -- interactive neural search and llama chat☆19Updated 2 months ago
- A simple, lightweight, terminal-style chat app that lets you connect to your local llama.cpp server☆27Updated 4 months ago
- Accepts a Hugging Face model URL, automatically downloads and quantizes it using Bits and Bytes.☆38Updated 8 months ago
- Note about running ollama 🦙☆30Updated 6 months ago