dmatora / LLM-inference-speed-benchmarks
☆18Updated 6 months ago
Alternatives and similar repositories for LLM-inference-speed-benchmarks:
Users who are interested in LLM-inference-speed-benchmarks are comparing it to the repositories listed below:
- run ollama & gguf easily with a single command☆50Updated 11 months ago
- A combination of Oobabooga's fork and the main cuda branch of GPTQ-for-LLaMa in a package format.☆22Updated last year
- AirLLM 70B inference with single 4GB GPU☆12Updated 8 months ago
- Simple, Fast, Parallel Huggingface GGML model downloader written in python☆24Updated last year
- ☆27Updated 8 months ago
- ☆19Updated last month
- Yet Another (LLM) Web UI, made with Gemini☆11Updated 4 months ago
- Lightweight OpenAI wrapper using FastAPI. Add rate limits to OpenAI usage, optionally log and store all API calls, and share regulated Op…☆13Updated last year
- Modified Beam Search with periodical restart☆12Updated 7 months ago
- ☆24Updated 3 months ago
- Attend - to what matters.☆14Updated 2 months ago
- A repository to store helpful information and emerging insights in regard to LLMs☆20Updated last year
- An open-source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO☆29Updated last week
- ☆11Updated 2 months ago
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min…☆26Updated 5 months ago
- A QT GUI for large language models☆32Updated last year
- The official Python library for Formulaic☆16Updated last year
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre…☆19Updated 6 months ago
- ☆22Updated 8 months ago
- Trying to deconstruct RWKV in understandable terms☆14Updated last year
- Large-Language-Model to Machine Interface project.☆18Updated last year
- Controllable Language Model Interactions in TypeScript☆9Updated 11 months ago
- Probably one of the lightest native RAG + Agent apps out there. Experience the power of Agent-powered models and Agent-driven knowledge ba…☆26Updated this week
- GoldFinch and other hybrid transformer components☆10Updated 3 weeks ago
- Note about running ollama 🦙☆35Updated 11 months ago
- Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI …☆49Updated 2 months ago
- Uses a Gradio interface to stream coding related responses from local and cloud based large language models. Pulls context from GitHub Re…☆21Updated last month
- LLM backed Fantasy Tribe Game☆18Updated 5 months ago
- Makes llama.cpp easy to use.☆12Updated last year
- Yet another frontend for LLM, written using .NET and WinUI 3☆10Updated 5 months ago