mistralai / vllm-release
A high-throughput and memory-efficient inference and serving engine for LLMs
☆50 Updated 11 months ago
Related projects
Alternatives and complementary repositories for vllm-release
- A new benchmark for measuring LLMs' ability to detect bugs in large codebases. ☆27 Updated 5 months ago
- ☆64 Updated 5 months ago
- inference code for mixtral-8x7b-32kseqlen ☆98 Updated 11 months ago
- an implementation of Self-Extend, to expand the context window via grouped attention ☆118 Updated 10 months ago
- ☆104 Updated 8 months ago
- ☆150 Updated 4 months ago
- Mixing Language Models with Self-Verification and Meta-Verification ☆97 Updated last year
- An all-new Language Model That Processes Ultra-Long Sequences of 100,000+ Ultra-Fast ☆137 Updated 2 months ago
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI ☆221 Updated 6 months ago
- This is our own implementation of 'Layer Selective Rank Reduction' ☆232 Updated 5 months ago
- Just a bunch of benchmark logs for different LLMs ☆115 Updated 3 months ago
- prime is a framework for efficient, globally distributed training of AI models over the internet. ☆212 Updated this week
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆62 Updated 2 weeks ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆64 Updated this week
- Data preparation code for Amber 7B LLM ☆83 Updated 6 months ago
- Distributed Inference for mlx LLM ☆70 Updated 3 months ago
- Official homepage for "Self-Harmonized Chain of Thought" ☆83 Updated 2 months ago
- Simple examples using Argilla tools to build AI ☆42 Updated this week
- Manage scalable open LLM inference endpoints in Slurm clusters ☆237 Updated 4 months ago
- ☆200 Updated 9 months ago
- ☆106 Updated 2 months ago
- A toolkit for building multimodal AI agents ☆111 Updated this week
- Scripts to create your own moe models using mlx ☆86 Updated 8 months ago
- The implementation of "Leeroo Orchestrator: Elevating LLMs Performance Through Model Integration" ☆51 Updated 6 months ago
- Routing on Random Forest (RoRF) ☆84 Updated last month
- GPT-4 Level Conversational QA Trained In a Few Hours ☆55 Updated 3 months ago
- Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub ☆155 Updated last year
- look how they massacred my boy ☆58 Updated last month
- ☆72 Updated last year
- ☆48 Updated last year