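A hybrid-inference extension plugin for vLLM; supports multi-NUMA hybrid inference, and single-GPU inference of the Qwen3-Next model can reach 1000+ prefill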
☆31Nov 7, 2025Updated 3 months ago
Alternatives and similar repositories for exvllm
Users that are interested in exvllm are comparing it to the libraries listed below
- CFT-RAG: An Entity Tree Based Retrieval Augmented Generation Algorithm With Cuckoo Filter☆22May 28, 2025Updated 9 months ago
- A "standard library" of Triton kernels.☆22Oct 2, 2025Updated 5 months ago
- Yet Another Papers With Code☆35Sep 7, 2025Updated 5 months ago
- ☆11Feb 25, 2026Updated last week
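- Batch-process data with large models; currently supports OCR with a range of large models, including Tongyi Qianwen, Moonshot AI, Baidu PaddleOCR, OpenAI, and LLaVA. Use LLMs to generate or clean data for academic use. Supports OCR with qwen, m…☆16Sep 15, 2024Updated last year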
- Code for Robust Fine-tuning (RbFT)☆17Jan 31, 2025Updated last year
- ☆20Aug 30, 2024Updated last year
- ☆19Oct 9, 2024Updated last year
- ☆26May 11, 2025Updated 9 months ago
- A travel agent based on Qwen2.5, fine-tuned with SFT + DPO/PPO/GRPO on a travel question-answer dataset; a mindmap can be output using …☆56Nov 14, 2025Updated 3 months ago
- Revision of official yolov7-pose to support custom dataset for keypoint detection☆11Nov 12, 2023Updated 2 years ago
- Acceleration for large-model inference frameworks; make LLMs fly☆24May 10, 2024Updated last year
- ☆28Oct 14, 2024Updated last year
- Codebase for Instruction Following without Instruction Tuning☆36Sep 24, 2024Updated last year
- DeepSeek-related APIs do not show the thinking process; this locally deployed project solves that.☆34Feb 13, 2025Updated last year
- ☆46Sep 26, 2025Updated 5 months ago
- ☆23Updated this week
- A Complete Introduction to Building Generative AI Apps with Dify☆17May 25, 2025Updated 9 months ago
- Vstream - Video Analytics pipeline with Hardware based accelerations (dev - stage)☆10Feb 2, 2024Updated 2 years ago
- A simple WeChat Official Account layout tool based on Dify☆17Jun 27, 2025Updated 8 months ago
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper☆32May 29, 2024Updated last year
- ☆28Dec 4, 2025Updated 3 months ago
- Uses yolov5 to detect road-occupation operations and vehicle parking violations in urban streets, and can independently delin…☆12Jan 2, 2023Updated 3 years ago
- Write the database metadata into the dify knowledge☆12Dec 30, 2025Updated 2 months ago
- ☆11Aug 29, 2025Updated 6 months ago
- A full-stack AI-powered business intelligence tool for non-experts, featuring serverless backend processing and a secure Streamlit fronte…☆28Feb 13, 2026Updated 3 weeks ago
- Workflow automation, but you just describe what you want and it happens.☆27Nov 22, 2025Updated 3 months ago
- Open source, modeled on the "Shanghai Jiao Tong University Survival Manual"☆16Sep 25, 2024Updated last year
- HealthiVert-GAN, a novel deep-learning framework designed to generate pseudo-healthy vertebral images. These images simulate the pre-frac…☆11Nov 3, 2025Updated 4 months ago
- Synthesizes speech with ChatTTS, uses FastAPI as the API server, and includes a management system built on GFast, providing voice management and a web UI☆35Jun 14, 2024Updated last year
- Blog information☆42Updated this week
- An SSH plugin for Dify☆13Jan 16, 2026Updated last month
- ☆14May 1, 2023Updated 2 years ago
- ☆12Jun 28, 2024Updated last year
- A small framework to benchmark forecasting models via backtesting☆13Nov 25, 2023Updated 2 years ago
- A Multi-Session and Multi-Therapy Benchmark for High-Realism AI Psychological Counselor☆30Jan 13, 2026Updated last month
- An implementation of MSSRM method☆11Mar 23, 2023Updated 2 years ago
- 🤖AI Agents for Financial Trading💰: LLM-Driven Stock Prediction & Investment Recommendation System☆13Apr 14, 2025Updated 10 months ago
- ☆28Jun 27, 2025Updated 8 months ago