☆16Nov 24, 2025Updated 3 months ago
Alternatives and similar repositories for vllm
Users that are interested in vllm are comparing it to the libraries listed below
- This is a fork of SGLang for hip-attention integration. Please refer to hip-attention for detail.☆18Dec 23, 2025Updated 2 months ago
- ☆14Dec 5, 2025Updated 2 months ago
- Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry☆42Jan 15, 2024Updated 2 years ago
- ☆12Jun 19, 2024Updated last year
- ☆16Nov 26, 2024Updated last year
- A minimal re-implementation of orthogonal fine-tuning (OFT) for LLMs. Based on nanoGPT and minLoRA.☆13Nov 17, 2023Updated 2 years ago
- Video scrubbing with WebCodecs☆15Nov 4, 2025Updated 3 months ago
- [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models☆11Dec 13, 2023Updated 2 years ago
- Better Live Text for MacOS☆33Feb 8, 2026Updated 3 weeks ago
- Fine-tuning llama2 using the chain-of-thought approach☆10Nov 18, 2023Updated 2 years ago
- Implementation of the ACL Findings paper "OutFlip: Generating Examples for Unknown Intent Detection with Natural Language Attack"☆10May 24, 2021Updated 4 years ago
- ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs)☆275Feb 10, 2026Updated 2 weeks ago
- ☆11Sep 19, 2025Updated 5 months ago
- ☆14Mar 21, 2019Updated 6 years ago
- Training NVIDIA NeMo Megatron Large Language Model (LLM) using NeMo Framework on Google Kubernetes Engine☆16Apr 28, 2025Updated 10 months ago
- TensorFlow Object Detection API in a single line☆12Dec 23, 2021Updated 4 years ago
- Fast and memory-efficient exact attention☆18Feb 23, 2026Updated last week
- GPT-J 6B inference on TensorRT with INT-8 precision☆11Apr 5, 2023Updated 2 years ago
- cursor logs with gpt-4o using litellm proxy☆14Sep 9, 2025Updated 5 months ago
- code for training and using chess embeddings models☆13Jun 9, 2024Updated last year
- Sample CloudFormation template to create spot fleet request☆11Mar 23, 2016Updated 9 years ago
- ☆12Aug 27, 2020Updated 5 years ago
- A library to encode text as DNA and decode DNA to text.☆13Nov 21, 2022Updated 3 years ago
- A fork of the PEFT library, supporting Robust Adaptation (RoSA)☆15Aug 16, 2024Updated last year
- Convert your text to emoji☆12Jun 27, 2018Updated 7 years ago
- Chunk Dedupe Estimation☆20Nov 5, 2024Updated last year
- A small application that summarizes conversation in a Discord channel☆19Oct 27, 2025Updated 4 months ago
- ScrollNet for Continual Learning☆11Sep 11, 2023Updated 2 years ago
- ☆16Mar 3, 2024Updated 2 years ago
- GPT-jax based on the official huggingface library☆13Jun 22, 2021Updated 4 years ago
- A framework for benchmarking embedding models in hybrid search scenarios (BM25 + vector search) using Weaviate.☆38Feb 12, 2026Updated 2 weeks ago
- ☆16Apr 3, 2024Updated last year
- ASR over WebSocket and POST/GET with FastAPI. Can use many Russian ASR models.☆18Jan 27, 2026Updated last month
- code of SOE-Net released in ICCV 2017☆15May 26, 2020Updated 5 years ago
- A simple seq2seq autoencoder example in TensorFlow☆13Sep 7, 2016Updated 9 years ago
- Let Me Run That For You: A C++20 Thread Pool Library☆12Feb 18, 2022Updated 4 years ago
- ML Reproducibility Challenge 2020: Electra reimplementation using PyTorch and Transformers☆12Apr 16, 2021Updated 4 years ago
- Demo repository for all the different ways to do eBPF Tracing☆17Feb 9, 2026Updated 3 weeks ago
- RabbitMQ operator for Kubernetes☆13Jun 8, 2020Updated 5 years ago