andersonbcdefg / dpo-lora
direct preference optimization with only 1 model copy :)
☆14 Updated 2 years ago
Alternatives and similar repositories for dpo-lora
Users who are interested in dpo-lora are comparing it to the repositories listed below.
 - ☆74 Updated last year
 - ☆51 Updated 7 months ago
 - ☆61 Updated last year
 - Public Inflection Benchmarks ☆68 Updated last year
 - Repository for the paper Stream of Search: Learning to Search in Language ☆151 Updated 9 months ago
 - Experiments for efforts to train a new and improved t5 ☆75 Updated last year
 - Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024) ☆193 Updated last year
 - ☆122 Updated 8 months ago
 - ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward exp… ☆224 Updated last month
 - OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆172 Updated 9 months ago
 - Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization" ☆90 Updated last year
 - Functional Benchmarks and the Reasoning Gap ☆89 Updated last year
 - Just a bunch of benchmark logs for different LLMs ☆118 Updated last year
 - Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization' ☆232 Updated 3 months ago
 - Replicating O1 inference-time scaling laws ☆90 Updated 11 months ago
 - Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆188 Updated 7 months ago
 - ☆114 Updated 2 weeks ago
 - ☆135 Updated 7 months ago
 - Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore". ☆218 Updated last week
 - Repository for NPHardEval, a quantified-dynamic benchmark of LLMs ☆59 Updated last year
 - Can Language Models Solve Olympiad Programming? ☆119 Updated 9 months ago
 - Evaluation of neuro-symbolic engines ☆39 Updated last year
 - ☆35 Updated 5 months ago
 - Code repository for the c-BTM paper ☆107 Updated 2 years ago
 - A 7B parameter model for mathematical reasoning ☆40 Updated 8 months ago
 - ☆108 Updated last year
 - Official repo for Learning to Reason for Long-Form Story Generation ☆72 Updated 6 months ago
 - Train your own SOTA deductive reasoning model ☆109 Updated 7 months ago
 - ☆195 Updated 6 months ago
 - Open source interpretability artefacts for R1. ☆163 Updated 6 months ago