mistralai / vllm-release

A high-throughput and memory-efficient inference and serving engine for LLMs

☆53

Alternatives and similar repositories for vllm-release

Users who are interested in vllm-release are comparing it to the libraries listed below.

kyegomez / Andromeda
An all-new Language Model That Processes Ultra-Long Sequences of 100,000+ Ultra-Fast
☆151 Updated 11 months ago
argilla-io / notus
Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first app…
☆168 Updated last year
Gryphe / BlockMerge_Gradient
Merge Transformers language models by use of gradient parameters.
☆206 Updated last year
QuixiAI / kraken
☆66 Updated last year
Leeroo-AI / leeroo_orchestrator
The implementation of "Leeroo Orchestrator: Elevating LLMs Performance Through Model Integration"
☆56 Updated last year
QuixiAI / laserRMT
This is our own implementation of 'Layer Selective Rank Reduction'
☆239 Updated last year
emrgnt-cmplxty / zero-shot-replication
☆74 Updated last year
Preemo-Inc / text-generation-inference
☆199 Updated last year
automix-llm / automix
Mixing Language Models with Self-Verification and Meta-Verification
☆105 Updated 7 months ago
4dh / GRDN
GRDN.AI app for garden optimization
☆70 Updated last year
NousResearch / Obsidian
Maybe the new state of the art vision model? we'll see 🤷‍♂️
☆167 Updated last year
TheBlokeAI / AIScripts
Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub
☆162 Updated last year
sdan / selfextend
an implementation of Self-Extend, to expand the context window via grouped attention
☆119 Updated last year
teknium1 / LLM-Benchmark-Logs
Just a bunch of benchmark logs for different LLMs
☆119 Updated last year
mzbac / mlx-moe
Scripts to create your own moe models using mlx
☆90 Updated last year
promptslab / LLMtuner
FineTune LLMs in few lines of code (Text2Text, Text2Speech, Speech2Text)
☆240 Updated last year
teknium1 / ShareGPT-Builder
☆116 Updated 7 months ago
Xalp / ECHO
Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025)
☆91 Updated 6 months ago
QuixiAI / OpenChatML
☆157 Updated last year
premAI-io / benchmarks
🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models.
☆137 Updated last year
anyscale / llm-router
Tutorial for building LLM router
☆221 Updated last year
Alignment-Lab-AI / Our-Projects
A repository of projects and datasets under active development by Alignment Lab AI
☆22 Updated last year
QuixiAI / SystemChat
☆30 Updated last year
thomasgauthier / LoRD
Low-Rank adapter extraction for fine-tuned transformers models
☆175 Updated last year
vikhyat / mixtral-inference
inference code for mixtral-8x7b-32kseqlen
☆101 Updated last year
migtissera / Sensei
Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI
☆222 Updated last year
LLM360 / amber-data-prep
Data preparation code for Amber 7B LLM
☆91 Updated last year
discus-labs / discus
A data-centric AI package for ML/AI. Get the best high-quality data for the best results. Discord: https://discord.gg/t6ADqBKrdZ
☆63 Updated last year
log10-io / log10
Python client library for improving your LLM app accuracy
☆98 Updated 5 months ago
LudwigStumpp / llm-leaderboard
A joint community effort to create one central leaderboard for LLMs.
☆304 Updated 11 months ago