A high-throughput and memory-efficient inference and serving engine for LLMs
☆18 · Nov 24, 2023 · Updated 2 years ago
Alternatives and similar repositories for vllm
Users that are interested in vllm are comparing it to the libraries listed below.
- This is AlpaGasus2-QLoRA based on LLaMA2 with AlpaGasus mechanism using QLoRA! ☆15 · Nov 22, 2023 · Updated 2 years ago
- Ultra-Fine Entity Typing with Weak Supervision from a Masked Language Model ☆18 · Aug 2, 2021 · Updated 4 years ago
- ☆13 · Jun 11, 2024 · Updated last year
- A Model Agnostic function to directly remove specified layers from the LLM ☆10 · May 23, 2024 · Updated last year
- ☆43 · May 9, 2024 · Updated last year
- accelerate generating vector by using onnx model ☆18 · Jan 23, 2024 · Updated 2 years ago
- A Multi-Format Transfer Learning Model for Event Argument Extraction via Variational Information Bottleneck ☆10 · Sep 9, 2022 · Updated 3 years ago
- Research project on glyph-based Chinese character embedding. Preparing for EMNLP 2019 ☆11 · Mar 18, 2019 · Updated 7 years ago
- Generate, Prune, Select: A Pipeline for Counterspeech Generation against Online Hate Speech (ACL-IJCNLP 2021 Findings) ☆13 · Jun 22, 2022 · Updated 3 years ago
- Unconditional Geomodeling related work (codes, data, and results) ☆17 · Jan 4, 2023 · Updated 3 years ago
- ☆16 · May 16, 2025 · Updated 10 months ago
- 📰 Named entity recognition (NER) and Entity linking (EL) on the dataset of Patents ☆16 · Jun 5, 2022 · Updated 3 years ago
- code for "Fine-grained Entity Typing via Label Reasoning" EMNLP2021 ☆13 · May 27, 2022 · Updated 3 years ago
- PDF table extraction ☆10 · Dec 14, 2021 · Updated 4 years ago
- this is a high performance cuda porting of cbow model of word2vec ☆17 · Sep 14, 2014 · Updated 11 years ago
- Source of BLAS and LAPACK via the Accelerate framework ☆18 · May 7, 2023 · Updated 2 years ago
- [ACL'25 Oral] What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective ☆76 · Jun 25, 2025 · Updated 9 months ago
- ☆12 · Jun 19, 2025 · Updated 9 months ago
- Contextual Retrieval solves this problem by prepending chunk-specific explanatory context to each chunk before embedding (“Contextual Emb… ☆28 · Sep 29, 2024 · Updated last year
- Zero-shot entity linking with less data ☆15 · Aug 1, 2022 · Updated 3 years ago
- [ICLR 2026] The official implementation of the paper “Anchored Supervised Fine-Tuning” ☆35 · Feb 12, 2026 · Updated last month
- ☆21 · Feb 8, 2025 · Updated last year
- Code for paper "Open Relation and Event Type Discovery with Type Abstraction". EMNLP 22' ☆16 · Nov 30, 2022 · Updated 3 years ago
- Evaluation utilities based on SymPy. ☆22 · Dec 12, 2024 · Updated last year
- 🔱 A naive tool for AssetBundles exploring. ☆11 · Nov 22, 2022 · Updated 3 years ago
- ☆20 · Jan 4, 2023 · Updated 3 years ago
- 2022 WAIC Hackathon, Ant Fortune track: AntSQL large-scale financial semantic parsing Chinese Text-to-SQL challenge. A beginner's code, hehe ☆13 · Mar 11, 2023 · Updated 3 years ago
- Using tensorflow to create a recommendation engine with DNN ☆16 · Jul 30, 2016 · Updated 9 years ago
- Code for paper "Parameter Efficient Multi-task Model Fusion with Partial Linearization" ☆25 · Sep 13, 2024 · Updated last year
- ☆31 · Sep 12, 2025 · Updated 6 months ago
- The PreTENS shared task hosted at SemEval 2022 aims at focusing on semantic competence with specific attention on the evaluation of langu… ☆12 · Feb 5, 2022 · Updated 4 years ago
- A highly contextualized retrieval system integrating Large Language Models (LLMs), embeddings, and a dynamic agent-driven framework. Supp… ☆27 · Sep 24, 2025 · Updated 6 months ago
- Modular and Simple approach to VQA in Keras ☆21 · Sep 6, 2017 · Updated 8 years ago
- TextPy: Collaborative Agent Workflow through Programming and Prompting ☆27 · May 9, 2025 · Updated 10 months ago
- Unified Representations of Structured and Unstructured Knowledge for Open-Domain Question Answering ☆50 · Aug 2, 2022 · Updated 3 years ago
- ☆18 · May 21, 2018 · Updated 7 years ago
- Keras implementation of hierarchical attention network for document classification with options to predict and present attention weights … ☆19 · Apr 11, 2019 · Updated 6 years ago
- Simple table extraction example. ☆10 · Jun 26, 2022 · Updated 3 years ago
- An unofficial implementation of the Infini-gram model proposed by Liu et al. (2024) ☆33 · Jun 19, 2024 · Updated last year