Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
☆62Aug 30, 2024Updated last year
Alternatives and similar repositories for CLAIR_and_APO
Users that are interested in CLAIR_and_APO are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆17Sep 1, 2024Updated last year
- ☆16Jul 23, 2024Updated last year
- Implementations of online merging optimizers proposed by Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment☆82Jun 19, 2024Updated last year
- Distributed multi-agent framework for event-driven, graph-based computation. Elixir/Python, NATS event streaming, modular operator/XCS ar…☆14Mar 25, 2026Updated 2 months ago
- ☆33Jan 6, 2025Updated last year
- Simple, predictable pricing with DigitalOcean hosting • AdAlways know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
- Code of "Regularized Best-of-N Sampling with Minimum Bayes Risk Objective for Language Model Alignment" (2025).☆14Apr 4, 2025Updated last year
- Implementation for the paper "Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning"☆11Jan 10, 2025Updated last year
- Self-Supervised Alignment with Mutual Information☆20May 24, 2024Updated 2 years ago
- ☆10Oct 24, 2024Updated last year
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards☆47Apr 15, 2025Updated last year
- An advanced distributed knowledge fabric for intelligent document processing, featuring multi-document agents, optimized query handling, …☆52Mar 25, 2026Updated 2 months ago
- PyTorch code for System-1.x: Learning to Balance Fast and Slow Planning with Language Models☆25Jul 22, 2024Updated last year
- Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Ou…☆32Apr 20, 2024Updated 2 years ago
- ☆131Oct 1, 2024Updated last year
- Serverless GPU API endpoints on Runpod - Get Bonus Credits • AdSkip the infrastructure headaches. Auto-scaling, pay-as-you-go, no-ops approach lets you focus on innovating your application.
- [ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning☆54Jul 28, 2024Updated last year
- Official implementation for "Law of the Weakest Link: Cross capabilities of Large Language Models"☆43Oct 1, 2024Updated last year
- Materials for "Quantifying the Plausibility of Context Reliance in Neural Machine Translation" at ICLR'24 🐑 🐑☆16Apr 18, 2024Updated 2 years ago
- This is the oficial repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"☆17Feb 22, 2024Updated 2 years ago
- ☆55Apr 18, 2026Updated last month
- https://liuzeming01.github.io/XDailyDialog/☆15Jun 25, 2023Updated 2 years ago
- ☆13Jun 4, 2024Updated last year
- LLM-Training-API: Including Embeddings & ReRankers, mergekit, LaserRMT☆27Feb 18, 2024Updated 2 years ago
- Official repository for ORPO☆482May 31, 2024Updated last year
- End-to-end encrypted cloud storage - Proton Drive • AdSpecial offer: 40% Off Yearly / 80% Off First Month. Protect your most important files, photos, and documents from prying eyes.
- Benchmarking Benchmark Leakage in Large Language Models☆60May 20, 2024Updated 2 years ago
- Reactive DDD with DSPy☆23Feb 24, 2024Updated 2 years ago
- KoCommonGEN v2: A Benchmark for Navigating Korean Commonsense Reasoning Challenges in Large Language Models☆25Aug 24, 2024Updated last year
- Structured outputs from DSPy and Jinja2☆27Jun 27, 2025Updated 10 months ago
- Train a SmolLM-style llm on fineweb-edu in JAX/Flax with an assortment of optimizers.☆19Jul 24, 2025Updated 10 months ago
- SIFT: Grounding LLM Reasoning in Contexts via Stickers☆57Mar 6, 2025Updated last year
- ☆24Jan 28, 2025Updated last year
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M…☆257Oct 30, 2024Updated last year
- [ICLR 2026] RPG: KL-Regularized Policy Gradient (https://arxiv.org/abs/2505.17508)☆75May 13, 2026Updated last week
- Open source password manager - Proton Pass • AdSecurely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
- Training and Benchmarking LLMs for Code Preference.☆38Nov 15, 2024Updated last year
- Repo for: When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment☆38Jun 5, 2023Updated 2 years ago
- RS-IMLE☆44Dec 7, 2024Updated last year
- Official repo of Respond-and-Respond: data, code, and evaluation☆104Aug 2, 2024Updated last year
- TS-LLaVA: Constructing Visual Tokens through Thumbnail-and-Sampling for Training-Free Video Large Language Models☆18Jan 2, 2025Updated last year
- DSPy Experiments☆10Aug 28, 2025Updated 8 months ago
- ☆23Dec 17, 2024Updated last year