substratusai / stapi
Sentence Transformers API: An OpenAI compatible embedding API server
☆24Updated 2 weeks ago
Related projects:
- Efficient few-shot learning with cross-encoders.☆35Updated 7 months ago
- C++ inference wrappers for running blazing fast embedding services on your favourite serverless like AWS Lambda. By Prithivi Da, PRs welc…☆21Updated 6 months ago
- Evaluation of bm42 sparse indexing algorithm☆60Updated 2 months ago
- ☆20Updated 7 months ago
- Python API for https://vespa.ai, the open big data serving engine☆89Updated this week
- Deploy a light, full OpenAI API for production with vLLM, supporting /v1/embeddings with all embedding models.☆32Updated 2 months ago
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents using an Elo ranker☆101Updated last week
- Ready-to-go containerized RAG service. Implemented with text-embedding-inference + Qdrant/LanceDB.☆40Updated 6 months ago
- Evaluation for AI apps and agents☆35Updated 8 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets.☆64Updated 2 months ago
- ASR + diarization model server with speculative decoding☆46Updated 3 months ago
- ☆20Updated 3 months ago
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models☆20Updated 7 months ago
- A framework for evaluating function calls made by LLMs☆34Updated last month
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners"☆68Updated last week
- Open Source Text Embedding Models with OpenAI Compatible API☆124Updated 2 months ago
- Completion After Prompt Probability. Make your LLM make a choice☆68Updated last week
- llama.cpp to PyTorch Converter☆21Updated 5 months ago
- This repository presents the original implementation of LumberChunker: Long-Form Narrative Document Segmentation by André V. Duarte, João…☆27Updated 2 weeks ago
- ☆13Updated 8 months ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡☆58Updated 2 weeks ago
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832).☆73Updated 6 months ago
- Fine-Tuning LLM and embedding models☆27Updated last year
- An ONNX converter script focused on embedding models☆24Updated 7 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024)☆67Updated 2 months ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.☆31Updated 2 weeks ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite.☆33Updated 6 months ago
- ☆71Updated 3 months ago
- ☆21Updated 5 months ago
- Universal text classifier for generative models☆19Updated last month