RAIVNLab / MRL
Code repository for the paper - "Matryoshka Representation Learning"
☆457 · Updated last year
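For context, here is a minimal sketch of the Matryoshka idea: train a single embedding so that several nested prefixes of it are each usable on their own, by summing a loss over the prefix lengths. The dimensions, class count, and module names below are illustrative assumptions, not the repo's actual API.

```python
# Illustrative sketch of Matryoshka Representation Learning (not the RAIVNLab/MRL API):
# apply a classification loss to several nested prefixes of one embedding so that
# truncated embeddings remain useful. Nesting dims and class count are placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MatryoshkaHead(nn.Module):
    def __init__(self, num_classes=1000, nesting=(8, 16, 32, 64, 128, 256, 512, 1024, 2048)):
        super().__init__()
        self.nesting = nesting
        # One linear classifier per nested prefix length (the paper also describes a
        # weight-sharing "efficient" variant; this is the simple version).
        self.heads = nn.ModuleList([nn.Linear(d, num_classes) for d in nesting])

    def forward(self, z, labels):
        # z: (batch, embed_dim) embeddings from a backbone; sum the loss over all prefixes.
        loss = 0.0
        for head, d in zip(self.heads, self.nesting):
            loss = loss + F.cross_entropy(head(z[:, :d]), labels)
        return loss

# Usage sketch: z = backbone(images); loss = mrl_head(z, labels); loss.backward()
```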
Alternatives and similar repositories for MRL:
Users interested in MRL are comparing it to the libraries listed below.
- Run Effective Large Batch Contrastive Learning Beyond GPU/TPU Memory Constraint · ☆374 · Updated 10 months ago
- Generative Representational Instruction Tuning · ☆596 · Updated last month
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1 GPU + 1 Day · ☆255 · Updated last year
- Scaling Data-Constrained Language Models · ☆333 · Updated 4 months ago
- Contriever: Unsupervised Dense Information Retrieval with Contrastive Learning · ☆711 · Updated last year
- Implementation of ST-MoE, the latest incarnation of MoE after years of research at Brain, in PyTorch · ☆306 · Updated 8 months ago
- Repo for "Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture" · ☆547 · Updated last month
- A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs) · ☆803 · Updated last week
- Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch · ☆501 · Updated 3 months ago
- Official repository for ORPO · ☆437 · Updated 8 months ago
- A large-scale, information-rich web dataset featuring millions of real clicked query-document labels · ☆313 · Updated 2 months ago
- A repository for research on medium-sized language models · ☆491 · Updated last month
- Helpful tools and examples for working with flex-attention · ☆635 · Updated this week
- DataComp: In search of the next generation of multimodal datasets · ☆679 · Updated last year
- Implementation of the conditionally routed attention in the CoLT5 architecture, in PyTorch · ☆225 · Updated 5 months ago
- Train Models Contrastively in PyTorch · ☆639 · Updated this week
- Large Context Attention · ☆682 · Updated 3 weeks ago
- Implementation of Recurrent Memory Transformer (NeurIPS 2022 paper), in PyTorch · ☆405 · Updated last month
- Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context" · ☆451 · Updated 11 months ago
- Annotated version of the Mamba paper · ☆473 · Updated 11 months ago
- [ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning · ☆584 · Updated 11 months ago
- DSIR: a large-scale data selection framework for language model training · ☆241 · Updated 10 months ago
- Official PyTorch implementation of QA-LoRA · ☆126 · Updated 11 months ago
- Code for T-Few from "Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning" · ☆443 · Updated last year
- Recurrent Memory Transformer · ☆149 · Updated last year
- Batched LoRAs · ☆338 · Updated last year
- Official repository of NEFTune: Noisy Embeddings Improve Instruction Finetuning · ☆389 · Updated 9 months ago
- Official implementation of "Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling" · ☆841 · Updated this week