frankxwang / dpo-prefix-sharingView external linksLinks
DPO, but faster 🚀
☆47Dec 6, 2024Updated last year
Alternatives and similar repositories for dpo-prefix-sharing
Users that are interested in dpo-prefix-sharing are comparing it to the libraries listed below
Sorting:
- ☆20Nov 4, 2025Updated 3 months ago
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/…☆28Apr 17, 2024Updated last year
- Simple repository for training small reasoning models☆49Feb 6, 2025Updated last year
- rabitq rust implementation☆10Feb 4, 2026Updated last week
- My hybrid TTS network that combines, VALL-E, VoiceBox, SpeechFlow, Seamless and TortoiseTTS into one☆26Aug 5, 2024Updated last year
- ☆16Jul 29, 2025Updated 6 months ago
- [ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models☆17Nov 4, 2025Updated 3 months ago
- See https://github.com/cuda-mode/triton-index/ instead!☆11May 8, 2024Updated last year
- ☆15May 15, 2021Updated 4 years ago
- ☆16Feb 6, 2024Updated 2 years ago
- Codebase for Efficient yet simple Reinforcement Learning Research Framework☆28Jan 14, 2023Updated 3 years ago
- Code for the paper: Comparing and Contrasting Deep Learning Weather Prediction Backbones on Navier-Stokes and Atmospheric Dynamics☆13Aug 9, 2024Updated last year
- Mixture-of-Basis-Experts for Compressing MoE-based LLMs☆27Dec 24, 2025Updated last month
- [ICLR 2026] Official repo for "Spotlight on Token Perception for Multimodal Reinforcement Learning"☆49Jan 30, 2026Updated 2 weeks ago
- Blazingly fast neighborhood attention☆14Nov 28, 2023Updated 2 years ago
- Codes for our paper "AgentMonitor: A Plug-and-Play Framework for Predictive and Secure Multi-Agent Systems"☆13Dec 13, 2024Updated last year
- Automating Sub-Agent Creation for Agentic Orchestration☆30Updated this week
- [ICLR'25] "Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers"☆40Mar 31, 2025Updated 10 months ago
- Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025]☆32Jan 23, 2025Updated last year
- The Code and Script of "David's Slingshot: A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis"☆35Jun 13, 2025Updated 8 months ago
- [ICLR 2025] Weighted-Reward Preference Optimization for Implicit Model Fusion☆14Mar 17, 2025Updated 11 months ago
- CoV: Chain-of-View Prompting for Spatial Reasoning☆50Jan 23, 2026Updated 3 weeks ago
- Gemstones: A Model Suite for Multi-Faceted Scaling Laws (NeurIPS 2025)☆33Sep 28, 2025Updated 4 months ago
- Latent Large Language Models☆19Aug 24, 2024Updated last year
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment☆16Dec 19, 2024Updated last year
- This repository contains the joint use of CPO and SimPO method for better reference-free preference learning methods.☆56Aug 13, 2024Updated last year
- Official code for our paper, "LoRA-Pro: Are Low-Rank Adapters Properly Optimized? "☆143Apr 8, 2025Updated 10 months ago
- Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening☆69May 18, 2025Updated 8 months ago
- An End-to-End Model with Adaptive Filtering for Retrieval-Augmented Generation☆16Oct 27, 2024Updated last year
- Curriculum training of instruction-following LLMs with Unsloth☆14Dec 15, 2025Updated 2 months ago
- [NeurIPS 2025] HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation☆75Sep 19, 2025Updated 4 months ago
- Set-Encoder: Permutation-Invariant Inter-Passage Attention for Listwise Passage Re-Ranking with Cross-Encoders☆18May 23, 2025Updated 8 months ago
- Codebase for Math Neurosurgery: Isolating LLMs' Math Reasoning Abilities Using Only Forward Passes☆21Jun 15, 2025Updated 8 months ago
- NISQA - Non-Intrusive Speech Quality and TTS Naturalness Assessment☆16Apr 13, 2022Updated 3 years ago
- ☆90Oct 30, 2025Updated 3 months ago
- Official implementation of "GPT or BERT: why not both?"☆61Jul 28, 2025Updated 6 months ago
- Code and Data for "FaithfulRAG: Fact-Level Conflict Modeling for Context-Faithful Retrieval-Augmented Generation" (ACL25)☆29Oct 26, 2025Updated 3 months ago
- Code for paper: Unified Text-to-Image Generation and Retrieval☆16Jul 6, 2024Updated last year
- ☆35Mar 12, 2025Updated 11 months ago