leoguillaume/vllm-embedding

Deploy a lightweight, complete OpenAI-compatible API for production with vLLM, supporting /v1/embeddings for all embedding models.

☆45
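
For readers unfamiliar with the endpoint named in the description, the following is a minimal sketch of how a client might talk to an OpenAI-style /v1/embeddings route. It is not taken from the vllm-embedding repository. The function names, the server address, and the model name are hypothetical; the request and response shapes assume the standard OpenAI embeddings format (a JSON body with `model` and `input`, and a response whose `data` list holds `embedding` and `index` fields).

```python
import json
import urllib.request


def build_embeddings_request(base_url, model, texts):
    """Build a POST request for an OpenAI-style /v1/embeddings endpoint.

    base_url and model are supplied by the caller; nothing here is
    specific to any one server implementation.
    """
    payload = {"model": model, "input": list(texts)}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embeddings_response(body):
    """Return the embedding vectors from a response body, ordered by index."""
    items = sorted(json.loads(body)["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]
```

To send the request, pass the result of `build_embeddings_request("http://localhost:8000", "example-embedding-model", ["hello world"])` to `urllib.request.urlopen` and feed the response bytes to `parse_embeddings_response`; both the address and the model name here are placeholders for your own deployment.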

Alternatives and similar repositories for vllm-embedding

Users who are interested in vllm-embedding are comparing it to the repositories listed below.

DaRL-LibSignal / OpenTI
IJMLC: Open-TI: Open Traffic Intelligence with Augmented Language Model
☆23 · Jul 30, 2025 · Updated 11 months ago
rag-wtf / open-text-embeddings
Open Source Text Embedding Models with OpenAI Compatible API
☆171 · Jul 13, 2024 · Updated 2 years ago
vaguenebula / AlpacaDataReflect
An experiment to see if ChatGPT can improve the output of the Stanford Alpaca dataset
☆12 · Mar 29, 2023 · Updated 3 years ago
nytopop / illu
realtime conversational dynamics
☆19 · Mar 19, 2025 · Updated last year
sujitpal / llm-rag-eval
Large Language Model (LLM) powered evaluator for Retrieval Augmented Generation (RAG) pipelines.
☆41 · Apr 29, 2024 · Updated 2 years ago
a-poor / watercooler
WaterCooler is an open source, desktop GUI for interacting with ChatGPT, created with Tauri.
☆31 · Dec 28, 2023 · Updated 2 years ago
patrickvonplaten / Wav2Vec2_ParlanceCTCDecode
☆11 · Nov 5, 2021 · Updated 4 years ago
chinaboard / whisperX-service
WhisperX Service loves Docker!
☆18 · Aug 17, 2024 · Updated last year
ITKaven / RoBMRC
☆10 · Mar 24, 2023 · Updated 3 years ago
opinionscience / BERTransfer
A BERT-based application for reusable text classification at scale
☆37 · Jul 23, 2023 · Updated 3 years ago
zxqfl / flag
☆18 · Dec 1, 2023 · Updated 2 years ago
CharlyCst / spadebox
Sandboxed tools and JS runtime for AI agents
☆17 · Jul 13, 2026 · Updated 2 weeks ago
pchizhov / picky_bpe
A BPE modification that removes intermediate tokens during tokenizer training.
☆27 · Nov 25, 2024 · Updated last year
zuucan / NeedleInAHaystack-PLUS
To assess long-text capabilities more comprehensively, we propose Needle-in-a-Haystack PLUS, which shifts the focus from simple fact r…
☆13 · Mar 4, 2024 · Updated 2 years ago
cdies / ML_microservice
☆10 · Jun 19, 2022 · Updated 4 years ago
hcd233 / Aris-AI-Model-Server
An OpenAI-compatible API that integrates LLM, Embedding, and Reranker.
☆18 · Aug 21, 2025 · Updated 11 months ago
NLPWM-WHU / SLGM
Shen Zhou, Tieyun Qian: On the Strength of Sequence Labeling and Generative Models for Aspect Sentiment Triplet Extraction. Findings of A…
☆12 · May 26, 2023 · Updated 3 years ago
xjdr-alt / muzero_sketch
☆40 · Jul 26, 2024 · Updated 2 years ago
lin2025 / gpt4
LinGPT, a GPT-4 chat webpage in just a single HTML file: no barrier to entry, set up in 10 seconds. Auto Key Rotation across multiple keys. Supports proxy platforms and third-party keys…
☆12 · Aug 28, 2023 · Updated 2 years ago
mlcommons / inference_results_v3.0
This repository contains the results and code for the MLPerf™ Inference v3.0 benchmark.
☆19 · Jul 24, 2025 · Updated last year
tarekziade / mwcat
MediaWiki Categories Model
☆13 · Feb 14, 2024 · Updated 2 years ago
u2d-ai / msaSDK
FastAPI Microservices Architecture SDK - As Basis for multiple services in a platform/system
☆12 · Oct 4, 2022 · Updated 3 years ago
hi-paris / XPER
A methodology designed to measure the contribution of the features to the predictive performance of any econometric or machine learning m…
☆18 · Jun 2, 2026 · Updated last month
blacktrub / cli-stream-chat
The best terminal chat client for your live streams
☆19 · Jun 10, 2023 · Updated 3 years ago
andrewmcodes / dishwasher
A CLI tool to help you easily delete forked repositories.
☆10 · May 16, 2026 · Updated 2 months ago
CarperAI / decontamination
This repository contains code for cleaning your training data of benchmark data to help combat data snooping.
☆28 · Apr 21, 2023 · Updated 3 years ago
OpenMOSS / Embodied-Planner-R1
Embodied-Planner-R1: Unleashing Embodied Task Planning Ability in LLMs via Reinforcement Learning
☆27 · Mar 30, 2026 · Updated 3 months ago
ninehills / embedding_finetuning
Fine-tuning embedding models.
☆14 · Nov 25, 2024 · Updated last year
wang-muhan / antigravity-interface
☆29 · Dec 26, 2025 · Updated 7 months ago
freesunshine0316 / lab-conv-asa
The project on Conversational Aspect Sentiment Analysis (CASA)
☆13 · Oct 8, 2022 · Updated 3 years ago
0xKoda / marketplace
☆10 · Jan 10, 2025 · Updated last year
phanxuanquang / AI-Composer
An AI assistant that helps you compose content right in Microsoft Word
☆18 · Sep 10, 2024 · Updated last year
UKPLab / arxiv2025-inherent-limits-plms
Code repository for the paper "The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Le…
☆14 · Jan 16, 2025 · Updated last year
tapio / 7drl-2016
Fantastic Dungeons - 7DRL 2016
☆10 · Mar 12, 2016 · Updated 10 years ago
etalab-ia / franceservices-backend
Backend resources for Albert. Albert is a conversational agent that uses official French data sources to answer administrative agents qu…
☆121 · Aug 6, 2025 · Updated 11 months ago
suhjohn / llm-workbench
UI for testing prompts across various datasets locally
☆13 · Nov 2, 2024 · Updated last year
brendanhogan / completion_tree_view
☆15 · Apr 26, 2025 · Updated last year
QuartzLibrary / glowpub
A glowfic to epub converter.
☆14 · Jul 11, 2026 · Updated 2 weeks ago
camenduru / nvidia-llm-colab
☆14 · Jul 25, 2023 · Updated 3 years ago