bentoml / sentence-embedding-bento
Sentence Embedding as a Service
☆15 Updated last year
Alternatives and similar repositories for sentence-embedding-bento:
Users who are interested in sentence-embedding-bento are comparing it to the libraries listed below.
- Cortex-compatible model server for Python and TensorFlow ☆17 Updated 2 years ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. ☆34 Updated 4 months ago
- A collection of reproducible inference engine benchmarks ☆29 Updated 2 weeks ago
- Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper) ☆28 Updated last year
- Fast and versatile tokenizer for language models, compatible with SentencePiece, Tokenizers, Tiktoken and more. Supports BPE, Unigram and… ☆22 Updated last month
- Simple dependency injection framework for Python ☆20 Updated 11 months ago
- ANE accelerated embedding models! ☆16 Updated 4 months ago
- Rust bindings for CTranslate2 ☆14 Updated last year
- ☆39 Updated 2 years ago
- Showcase how mxbai-embed-large-v1 can be used to produce binary embedding. Binary embeddings enabled 32x storage savings and 40x faster r… ☆17 Updated last year
- The backend behind the LLM-Perf Leaderboard ☆10 Updated last year
- A clone of OpenAI's Tokenizer page for HuggingFace Models ☆45 Updated last year
- setup the env for vllm users ☆16 Updated last year
- a pipeline for using api calls to agnostically convert unstructured data into structured training data ☆30 Updated 7 months ago
- Some microbenchmarks and design docs before commencement ☆12 Updated 4 years ago
- A text-to-SQL prototype on the northwind sqlite dataset ☆12 Updated 7 months ago
- This repository contains code for cleaning your training data of benchmark data to help combat data snooping. ☆25 Updated 2 years ago
- DSPy program/pipeline inspector widget for Jupyter/VSCode Notebooks. ☆34 Updated last year
- A library for squeakily cleaning and filtering language datasets. ☆47 Updated last year
- 🤝 Trade any tensors over the network ☆30 Updated last year
- WIP. Veloce is a low-code Ray-based parallelization library that makes machine learning computation novel, efficient, and heterogeneous. ☆18 Updated 2 years ago
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆26 Updated 5 months ago
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ☆43 Updated last year
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 Updated last year
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆22 Updated 2 years ago
- A file utility for accessing both local and remote files through a unified interface. ☆41 Updated 2 weeks ago
- MLFlow Deployment Plugin for Ray Serve ☆44 Updated 3 years ago
- Vector Database with support for late interaction and token level embeddings. ☆54 Updated 7 months ago
- Explore the use of DSPy for extracting features from PDFs 🔎 ☆39 Updated last year
- Visualize expert firing frequencies across sentences in the Mixtral MoE model ☆17 Updated last year