A high-throughput and memory-efficient inference and serving engine for LLMs
☆18Nov 24, 2023Updated 2 years ago
Alternatives and similar repositories for vllm
Users who are interested in vllm are comparing it to the libraries listed below.
- Shaping Language Models with Cognitive Insights☆15Feb 29, 2024Updated 2 years ago
- the benchmark for finance☆11Jul 4, 2023Updated 2 years ago
- This is AlpaGasus2-QLoRA based on LLaMA2 with AlpaGasus mechanism using QLoRA!☆15Nov 22, 2023Updated 2 years ago
- Ultra-Fine Entity Typing with Weak Supervision from a Masked Language Model☆18Aug 2, 2021Updated 4 years ago
- ☆14Jun 11, 2024Updated 2 years ago
- [CVPR'26] UniGame code implementation☆20Apr 21, 2026Updated last month
- Unsupervised tableQA and databaseQA on chinese finance question and tabular data☆13Apr 20, 2023Updated 3 years ago
- ☆21May 22, 2023Updated 3 years ago
- A Model Agnostic function to directly remove specified layers from the LLM☆10May 23, 2024Updated 2 years ago
- ☆42May 9, 2024Updated 2 years ago
- Accelerate vector generation using an ONNX model☆18Jan 23, 2024Updated 2 years ago
- A Multi-Format Transfer Learning Model for Event Argument Extraction via Variational Information Bottleneck☆10Sep 9, 2022Updated 3 years ago
- ChatYuan-7B☆13Jun 16, 2023Updated 3 years ago
- ☆11Mar 12, 2021Updated 5 years ago
- Reproduced the DFT method without using Verl. https://arxiv.org/abs/2508.05629☆23Oct 14, 2025Updated 8 months ago
- ☆10Aug 15, 2023Updated 2 years ago
- Generate, Prune, Select: A Pipeline for Counterspeech Generation against Online Hate Speech (ACL-IJCNLP 2021 Findings)☆13Jun 22, 2022Updated 3 years ago
- ☆16May 16, 2025Updated last year
- An implementation for MLLM oversensitivity evaluation☆18Nov 16, 2024Updated last year
- PDF table extraction☆10Dec 14, 2021Updated 4 years ago
- Source of BLAS and LAPACK via the Accelerate framework☆18May 7, 2023Updated 3 years ago
- ☆12Jun 19, 2025Updated last year
- Open-source Human Feedback Library☆11Oct 25, 2023Updated 2 years ago
- ☆17Nov 3, 2024Updated last year
- Zero-shot entity linking with less data☆15Aug 1, 2022Updated 3 years ago
- ☆12Apr 29, 2024Updated 2 years ago
- ☆14Jul 11, 2024Updated last year
- Implementation of http://arxiv.org/pdf/1511.06391v4.pdf in Keras☆13Oct 3, 2016Updated 9 years ago
- ☆23Feb 8, 2025Updated last year
- A SOCKS5 proxy tool for bypassing the firewall, written in Java☆13Mar 13, 2015Updated 11 years ago
- Code for paper "Open Relation and Event Type Discovery with Type Abstraction". EMNLP 22'☆15Nov 30, 2022Updated 3 years ago
- [ICLR 2026] The official implementation of the paper “Anchored Supervised Fine-Tuning”☆44May 8, 2026Updated last month
- The software associated with a paper accepted at EMNLP 2021 titled "Open Knowledge Graphs Canonicalization using Variational Autoencoders…☆16Sep 27, 2021Updated 4 years ago
- A new algorithm that formulates jailbreaking as a reasoning problem.☆26Jul 2, 2025Updated 11 months ago
- Unofficial PyTorch implementation of the paper "Multi-Label Image Recognition with Graph Convolutional Networks"☆10Feb 19, 2023Updated 3 years ago
- 2022 WAIC Hackathon, Ant Fortune track: AntSQL large-scale financial semantic parsing Chinese Text-to-SQL challenge. A beginner's code, hehe☆14Mar 11, 2023Updated 3 years ago
- ☆31Sep 12, 2025Updated 9 months ago
- Data and code for the paper "The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems"☆21Jul 18, 2023Updated 2 years ago
- The PreTENS shared task hosted at SemEval 2022 aims at focusing on semantic competence with specific attention on the evaluation of langu…☆12Feb 5, 2022Updated 4 years ago