frankxwang / dpo-prefix-sharing
DPO, but faster 🚀
☆20 Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for dpo-prefix-sharing
- A repository for research on medium sized language models. ☆74 Updated 5 months ago
- ☆26 Updated 4 months ago
- Lottery Ticket Adaptation ☆35 Updated last month
- Official implementation of ECCV24 paper: POA ☆24 Updated 3 months ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's Pytorch Lightning suite. ☆33 Updated 8 months ago
- ☆61 Updated 2 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆46 Updated 2 months ago
- Triton Implementation of HyperAttention Algorithm ☆46 Updated 11 months ago
- ☆20 Updated last week
- Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO… ☆51 Updated last week
- Collection of autoregressive model implementation ☆66 Updated last week
- Using FlexAttention to compute attention with different masking patterns ☆40 Updated last month
- GoldFinch and other hybrid transformer components ☆39 Updated 3 months ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" ☆36 Updated 11 months ago
- ☆44 Updated 2 months ago
- Learning to Retrieve by Trying - Source code for Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval ☆24 Updated last week
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention" ☆91 Updated last month
- Experimental scripts for researching data adaptive learning rate scheduling. ☆23 Updated last year
- This repo is based on https://github.com/jiaweizzhao/GaLore ☆18 Updated last month
- ☆12 Updated 3 weeks ago
- ☆57 Updated last month
- Official repository for ICML 2024 paper "MoRe Fine-Tuning with 10x Fewer Parameters" ☆16 Updated this week
- Code for Paper: Harnessing Webpage Uis For Text Rich Visual Understanding ☆37 Updated 3 weeks ago
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format ☆25 Updated last year
- We introduce EMMET and unify model editing with popular algorithms ROME and MEMIT. ☆12 Updated 2 months ago
- ☆38 Updated this week
- ☆34 Updated 8 months ago
- ☆15 Updated 3 months ago
- From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu,… ☆43 Updated 3 months ago
- Understanding the correlation between different LLM benchmarks ☆29 Updated 10 months ago