Starbucks: Improved Training for 2D Matryoshka Embeddings
☆22Jun 30, 2025Updated 8 months ago
Alternatives and similar repositories for Starbucks
Users who are interested in Starbucks are comparing it to the libraries listed below.
- Rank-DistiLLM: Closing the Effectiveness Gap Between Cross-Encoders and LLMs for Passage Re-Ranking☆25Apr 4, 2025Updated 11 months ago
- ☆10Oct 2, 2024Updated last year
- ModernBERT model optimized for Apple Neural Engine.☆31Jan 10, 2025Updated last year
- Code for "Inducer-tuning: Connecting Prefix-tuning and Adapter-tuning" (EMNLP 2022) and "Empowering Parameter-Efficient Transfer Learning…☆11Feb 6, 2023Updated 3 years ago
- Effective Unsupervised Domain Adaptation of Neural Rankers by Diversifying Synthetic Query Generation☆15Apr 23, 2025Updated 10 months ago
- ☆53Oct 13, 2025Updated 4 months ago
- Use contrastive learning to train a large language model (LLM) as a retriever☆12Jul 19, 2024Updated last year
- The official repository for 'CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos', SIGI…☆16May 4, 2022Updated 3 years ago
- This repository helps you evaluate your models on the FreshStack benchmark!☆33Dec 9, 2025Updated 2 months ago
- 🌏 Modular retrievers for zero-shot multilingual IR.☆30Mar 6, 2024Updated last year
- EnriCo: Enriched Representation and Globally Constrained Inference for Entity and Relation Extraction☆26May 22, 2024Updated last year
- Code for paper "Prompt-Based Metric Learning for Few-shot NER".☆23Nov 14, 2023Updated 2 years ago
- An open-source TAR framework for experiments and applications☆18Mar 18, 2024Updated last year
- Small python package to measure OCR quality and other related metrics.☆27Feb 19, 2024Updated 2 years ago
- Official implementation of "GPT or BERT: why not both?"☆62Jul 28, 2025Updated 7 months ago
- Official Repository of Pretraining Without Attention (BiGS), BiGS is the first model to achieve BERT-level transfer learning on the GLUE …☆117Mar 16, 2024Updated last year
- SIGIR 2021: Efficiently Teaching an Effective Dense Retriever with Balanced Topic Aware Sampling☆60Jul 11, 2021Updated 4 years ago
- Truly flash implementation of DeBERTa disentangled attention mechanism.☆78Feb 10, 2026Updated 3 weeks ago
- Pre-train Static Word Embeddings☆93Sep 9, 2025Updated 5 months ago
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models.☆91Jan 9, 2026Updated last month
- A passion project on my favorite e-commerce site that scrapes product data and builds a recommendation engine☆10May 2, 2023Updated 2 years ago
- A large-scale information-rich web dataset, featuring millions of real clicked query-document labels☆346Dec 16, 2024Updated last year
- Lightweight toolkit package to train and fine-tune 1.58bit Language models☆112May 19, 2025Updated 9 months ago
- Token-free Language Modeling with ByGPT5 & Friends!☆12Jul 18, 2025Updated 7 months ago
- ☆16Updated this week
- The official implementation of the paper "Text Classification in the Wild: a Large-scale Long-tailed Name Normalization Dataset"(ICASSP 2…☆12Feb 19, 2023Updated 3 years ago
- ☆10May 1, 2025Updated 10 months ago
- Learning materials for the Life In The UK test.☆13Mar 25, 2023Updated 2 years ago
- The PyTorch implementation of paper "KERMIT: Knowledge Graph Completion of Enhanced Relation Modeling with Inverse Transformation"☆15Jul 4, 2025Updated 8 months ago
- Scrape web content into readable Markdown for LLMs and human readers☆10Feb 19, 2024Updated 2 years ago
- Code Roberta version of RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder☆10Mar 16, 2023Updated 2 years ago
- A library for computing diverse text characteristics and using them to analyze data sets and models with ease.☆41Aug 18, 2022Updated 3 years ago
- A RAG that can scale 🧑🏻‍💻☆11May 28, 2024Updated last year
- Self-Supervised Document-to-Document Similarity Ranking via Contextualized Language Models and Hierarchical Inference☆45Nov 28, 2022Updated 3 years ago
- ☆48Jan 21, 2024Updated 2 years ago
- Lightweight Nearest Neighbors with Flexible Backends☆333Dec 30, 2025Updated 2 months ago
- Combining encoder-based language models☆11Nov 11, 2021Updated 4 years ago
- Reading great papers in the history of artificial intelligence and machine learning☆10Oct 26, 2022Updated 3 years ago
- Implementation and results for ICTIR2021 paper: Effective and Privacy-preserving Federated Online Learning to Rank☆10Jul 24, 2021Updated 4 years ago