embeddings-benchmark / arena
Code for the MTEB Arena
☆18Updated 4 months ago
Alternatives and similar repositories for arena:
Users who are interested in arena are comparing it to the libraries listed below
- Minimal PyTorch implementation of BM25 (with sparse tensors)☆97Updated 10 months ago
- Code for Zero-Shot Tokenizer Transfer☆121Updated 2 weeks ago
- ☆66Updated last month
- [Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers☆124Updated 10 months ago
- A fast implementation of T5/UL2 in PyTorch using Flash Attention☆78Updated this week
- Dataset collection and preprocessing framework for NLP extreme multitask learning☆173Updated 3 weeks ago
- Comprehensive analysis of the difference in performance of QLoRA, LoRA, and full finetunes.☆82Updated last year
- ☆38Updated 9 months ago
- Code for NeurIPS LLM Efficiency Challenge☆54Updated 9 months ago
- QLoRA with Enhanced Multi GPU Support☆36Updated last year
- ☆48Updated 2 months ago
- A RAG that can scale 🧑🏻‍💻☆11Updated 8 months ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data; it should also work with any Hugging Face text dataset.☆93Updated last year
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets.☆66Updated 3 months ago
- 🚢 Data Toolkit for Sailor Language Models☆85Updated last month
- Using open source LLMs to build synthetic datasets for direct preference optimization☆52Updated 11 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters☆249Updated 6 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆53Updated 5 months ago
- A collection of datasets for language model pretraining, including scripts for downloading, preprocessing, and sampling.☆56Updated 6 months ago
- PyTorch implementation for MRL☆18Updated 11 months ago
- ☆116Updated 4 months ago
- ☆57Updated 4 months ago
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832).☆79Updated 10 months ago
- ☆79Updated last month
- Fast, Modern, Memory Efficient, and Low Precision PyTorch Optimizers☆78Updated 6 months ago
- Minimum Bayes Risk Decoding for Hugging Face Transformers☆56Updated 7 months ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P…☆34Updated last year
- Generalist and Lightweight Model for Text Classification☆59Updated last week
- Pretraining Efficiently on S2ORC!☆149Updated 3 months ago