puppetm4st3r / baai_m3_simple_serverLinks

This code sets up a simple yet robust server using FastAPI for handling asynchronous requests for embedding generation and reranking tasks using the BAAI M3 multilingual model.

☆70

Alternatives and similar repositories for baai_m3_simple_server

Users that are interested in baai_m3_simple_server are comparing it to the libraries listed below

Sorting:

rag-wtf / open-text-embeddings
Open Source Text Embedding Models with OpenAI Compatible API
☆160Updated last year
jina-ai / late-chunking
Code for explaining and evaluating late chunking (chunked pooling)
☆453Updated 10 months ago
run-llama / finetune-embedding
Fine-Tuning Embedding for RAG with Synthetic Data
☆512Updated 2 years ago
kevaldekivadiya2415 / textembed
TextEmbed is a REST API crafted for high-throughput and low-latency embedding inference. It accommodates a wide variety of embedding mode…
☆25Updated last year
aurelio-labs / semantic-chunkers
☆237Updated 4 months ago
denser-org / denser-retriever
An enterprise-grade AI retriever designed to streamline AI integration into your applications, ensuring cutting-edge accuracy.
☆290Updated 4 months ago
isaacus-dev / semchunk
A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks.
☆400Updated this week
PrithivirajDamodaran / FlashRank
Lite & Super-fast re-ranking for your search & retrieval pipelines. Supports SoTA Listwise and Pairwise reranking based on LLMs and cro…
☆877Updated last month
chu-tianxiang / vllm-gptq
A high-throughput and memory-efficient inference and serving engine for LLMs
☆131Updated last year
plaggy / rag-containers
Ready-to-go containerized RAG service. Implemented with text-embedding-inference + Qdrant/LanceDB.
☆73Updated 10 months ago
rizerphe / local-llm-function-calling
A tool for generating function arguments and choosing what function to call with local LLMs
☆432Updated last year
stephenleo / llm-structured-output-benchmarks
Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task…
☆179Updated last year
runpod-workers / worker-vllm
The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
☆373Updated last week
run-llama / modal_finetune_sql
☆319Updated last year
IAAR-Shanghai / Meta-Chunking
Meta-Chunking: Learning Efficient Text Segmentation via Logical Perception
☆251Updated last month
khaimt / qa_expert
This repo is for handling Question Answering, especially for Multi-hop Question Answering
☆67Updated last year
castorini / rank_llm
RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking.
☆545Updated this week
Yannael / multilingual-embeddings
☆65Updated last year
TIGER-AI-Lab / LongRAG
Official repo for "LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs".
☆240Updated last year
milvus-io / milvus-lite
A lightweight version of Milvus
☆381Updated 2 weeks ago
h2oai / enterprise-h2ogpte
Client Code Examples, Use Cases and Benchmarks for Enterprise h2oGPTe RAG-Based GenAI Platform
☆91Updated last month
langchain-ai / text-split-explorer
☆267Updated 2 years ago
joaodsmarques / LumberChunker
This repository presents the original implementation of LumberChunker: Long-Form Narrative Document Segmentation by André V. Duarte, João…
☆80Updated last year
chentong0 / factoid-wiki
Dense X Retrieval: What Retrieval Granularity Should We Use?
☆163Updated last year
cckuailong / SuperAdapters
Finetune ALL LLMs with ALL Adapeters on ALL Platforms!
☆330Updated 3 months ago
liyucheng09 / Selective_Context
Compress your input to ChatGPT or other LLMs, to let them process 2x more content and save 40% memory and GPU time.
☆397Updated last year
mixedbread-ai / baguetter
Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp…
☆190Updated last year
tensorchord / modelz-llm
OpenAI compatible API for LLMs and embeddings (LLaMA, Vicuna, ChatGLM and many others)
☆275Updated 2 years ago
arcee-ai / DALM
Domain Adapted Language Modeling Toolkit - E2E RAG
☆329Updated 11 months ago
THUDM / LongCite
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA
☆507Updated 10 months ago