dmatora / LLM-inference-speed-benchmarks
☆18Updated 9 months ago
Alternatives and similar repositories for LLM-inference-speed-benchmarks
Users who are interested in LLM-inference-speed-benchmarks are comparing it to the libraries listed below
- Trying to deconstruct RWKV in understandable terms☆14Updated 2 years ago
- A combination of Oobabooga's fork and the main cuda branch of GPTQ-for-LLaMa in a package format.☆22Updated last year
- Yet Another (LLM) Web UI, made with Gemini☆12Updated 6 months ago
- ☆28Updated 10 months ago
- Simple, Fast, Parallel Huggingface GGML model downloader written in Python☆24Updated last year
- LLM Divergent Thinking Creativity Benchmark. LLMs generate 25 unique words that start with a given letter with no connections to each oth…☆31Updated 3 months ago
- Lightweight continuous batching OpenAI compatibility using HuggingFace Transformers, including T5 and Whisper.☆26Updated 3 months ago
- 33B Chinese LLM, DPO QLORA, 100K context, AirLLM 70B inference with single 4GB GPU☆13Updated last year
- AirLLM 70B inference with single 4GB GPU☆14Updated 2 weeks ago
- An open source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO☆29Updated this week
- convert a saved pytorch model to gguf and generate as much corresponding ggml c code as possible☆15Updated last year
- GoldFinch and other hybrid transformer components☆10Updated this week
- A converter and basic tester for rwkv onnx☆42Updated last year
- Modified Beam Search with periodical restart☆12Updated 10 months ago
- Controllable Language Model Interactions in TypeScript☆9Updated last year
- Experimental sampler to make LLMs more creative☆31Updated last year
- Training hybrid models for dummies.☆23Updated 5 months ago
- PowerShell automation to rebuild llama.cpp for a Windows environment.☆32Updated last month
- The heart of The Pulsar App, fast, secure and shared inference with modern UI☆57Updated 7 months ago
- Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI …☆49Updated 5 months ago
- OpenPipe Reinforcement Learning Experiments☆25Updated 4 months ago
- ☆22Updated 11 months ago
- ☆11Updated last month
- This repository is about implementing The Personality Cores Conversation System originally developed by Aperture Science, Inc. in the Por…☆25Updated last year
- run ollama & gguf easily with a single command☆52Updated last year
- ☆31Updated last year
- V.I.S.O.R., my in-development AI-powered voice assistant with integrated memory!☆37Updated 2 months ago
- BUD-E (Buddy) is an open-source voice assistant framework that facilitates seamless interaction with AI models and APIs, enabling the cre…☆36Updated 11 months ago
- BlinkDL's RWKV-v4 running in the browser☆47Updated 2 years ago
- Experiments with BitNet inference on CPU☆54Updated last year