Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard
★567 · Mar 22, 2026 · Updated last month
Alternatives and similar repositories for AnglE
Users who are interested in AnglE are comparing it to the libraries listed below.
- RAN: Recurrent Attention Networks for Long-text Modeling | Findings of ACL23 ★23 · Aug 12, 2023 · Updated 2 years ago
- Tool for converting LLMs from uni-directional to bi-directional by removing causal mask for tasks like classification and sentence embedd… ★65 · Dec 12, 2024 · Updated last year
- Generative Representational Instruction Tuning ★691 · Jun 25, 2025 · Updated 10 months ago
- Codebase for RetroMAE and beyond. ★273 · Jun 7, 2024 · Updated last year
- Retrieval and Retrieval-augmented LLMs ★11,642 · Apr 22, 2026 · Updated last week
- Official Code for Merging Statistical Feature via Adaptive Gate for Improved Text Classification (AAAI2021) ★26 · Feb 5, 2022 · Updated 4 years ago
- text embedding ★146 · Sep 18, 2023 · Updated 2 years ago
- Scaling Sentence Embeddings with Large Language Models ★108 · Mar 22, 2024 · Updated 2 years ago
- A Simple but Powerful SOTA NER Model | Official Code For Label Supervised LLaMA Finetuning ★152 · Mar 17, 2024 · Updated 2 years ago
- Train Models Contrastively in Pytorch ★787 · Mar 26, 2025 · Updated last year
- Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp… ★210 · Aug 31, 2024 · Updated last year
- GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings ★45 · Mar 6, 2024 · Updated 2 years ago
- WIP: Ofen is a toolkit aimed at making transformer models production-ready. API included ★17 · Oct 2, 2024 · Updated last year
- A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets. ★2,168 · Oct 16, 2025 · Updated 6 months ago
- Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders' ★1,686 · Apr 4, 2026 · Updated last month
- T2Ranking: A large-scale Chinese benchmark for passage ranking. ★162 · Jul 3, 2023 · Updated 2 years ago
- MTEB: Massive Text Embedding Benchmark ★3,247 · Updated this week
- Late Interaction Models Training & Retrieval ★796 · Updated this week
- Official Code For TDEER: An Efficient Translating Decoding Schema for Joint Extraction of Entities and Relations (EMNLP 2021) ★41 · Jul 27, 2024 · Updated last year
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ★1,612 · Dec 20, 2025 · Updated 4 months ago
- RankLLM is a Python toolkit for reproducible information retrieval research using rerankers, with a focus on listwise reranking. ★592 · Updated this week
- ★165 · Apr 17, 2024 · Updated 2 years ago
- Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture" ★562 · Dec 28, 2024 · Updated last year
- ★62 · Jul 21, 2024 · Updated last year
- Fast BM25 search in Python, powered by Numpy and Numba ★1,648 · Updated this week
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o… ★160 · Jul 14, 2025 · Updated 9 months ago
- Scalable training for dense retrieval models. ★298 · Apr 8, 2026 · Updated 3 weeks ago
- Tevatron - Unified Document Retrieval Toolkit across Scale, Language, and Modality. Demo in SIGIR 2023, SIGIR 2025. ★735 · Updated this week
- Rank-DistiLLM: Closing the Effectiveness Gap Between Cross-Encoders and LLMs for Passage Re-Ranking ★25 · Apr 4, 2025 · Updated last year
- Code repository for the paper - "Matryoshka Representation Learning" ★627 · Feb 19, 2024 · Updated 2 years ago
- FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions ★54 · Jul 3, 2024 · Updated last year
- [ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings ★2,022 · Jan 15, 2025 · Updated last year
- A large-scale multilingual dataset for Information Retrieval. Thorough human-annotations across 18 diverse languages. ★207 · Jul 31, 2024 · Updated last year
- State-of-the-Art Text Embeddings ★18,615 · Updated this week
- Crispy reranking models by Mixedbread ★51 · Sep 17, 2025 · Updated 7 months ago
- Showcase how mxbai-embed-large-v1 can be used to produce binary embedding. Binary embeddings enabled 32x storage savings and 40x faster r… ★19 · Mar 23, 2024 · Updated 2 years ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ★49 · Nov 13, 2023 · Updated 2 years ago
- Infinity is a high-throughput, low-latency serving engine for text-embeddings, reranking models, clip, clap and colpali ★2,782 · Mar 24, 2026 · Updated last month
- ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23) ★3,849 · Oct 14, 2025 · Updated 6 months ago