Extensive Self-Contrast Enables Feedback-Free Language Model Alignment
☆21 · Apr 2, 2024 · Updated last year
Alternatives and similar repositories for Self-Contrast
Users who are interested in Self-Contrast are comparing it to the repositories listed below.
- ☆13 · Jul 2, 2025 · Updated 8 months ago
- Reproduction of "RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment" ☆69 · Aug 18, 2023 · Updated 2 years ago
- Dataset Reset Policy Optimization ☆31 · Apr 12, 2024 · Updated last year
- APAR: LLMs Can Do Auto-Parallel Auto-Regressive Decoding ☆14 · Jul 22, 2024 · Updated last year
- [ICLR 2026] Official repo for "Spotlight on Token Perception for Multimodal Reinforcement Learning" ☆51 · Jan 30, 2026 · Updated last month
- ☆12 · Dec 16, 2025 · Updated 3 months ago
- A benchmark for evaluating situated inductive reasoning ☆15 · Jan 7, 2025 · Updated last year
- Aligning Agentic World Models via Knowledgeable Experience Learning ☆32 · Jan 25, 2026 · Updated last month
- ☆15 · Jun 11, 2024 · Updated last year
- ☆10 · Mar 3, 2026 · Updated 2 weeks ago
- Official Pytorch implementation of "Omni-AVSR: Towards Unified Multimodal Speech Recognition with Large Language Models" [IEEE ICASSP 202… ☆32 · Mar 10, 2026 · Updated last week
- Unofficial PyTorch implementation of "PoE-GAN: Multimodal Conditional Image Synthesis with Product-of-Experts GANs" ☆14 · Mar 29, 2023 · Updated 2 years ago
- The official repository of "SmartAgent: Chain-of-User-Thought for Embodied Personalized Agent in Cyber World". ☆27 · Aug 20, 2025 · Updated 7 months ago
- NaturalCodeBench (Findings of ACL 2024) ☆68 · Oct 14, 2024 · Updated last year
- ☆21 · Dec 14, 2024 · Updated last year
- ☆46 · Jun 11, 2025 · Updated 9 months ago
- CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning ☆35 · Aug 28, 2025 · Updated 6 months ago
- Official implementation of the paper "From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large L… ☆53 · Jun 24, 2024 · Updated last year
- In-BoXBART: Get Instructions into Biomedical Multi-task Learning ☆14 · Aug 23, 2022 · Updated 3 years ago
- Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation". ☆27 · Aug 9, 2025 · Updated 7 months ago
- A Wordle game written in Rust, refined. Play in browser with the power of WebAssembly! Course project of Programming Training, Tsinghua U… ☆17 · Jul 10, 2024 · Updated last year
- Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment" ☆75 · May 20, 2025 · Updated 10 months ago
- Optimizing Anytime Reasoning via Budget Relative Policy Optimization ☆52 · Jul 15, 2025 · Updated 8 months ago
- [ICML'25] "Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding" by Jiajun Zhu, Peihao Wang, Ruisi… ☆14 · Jun 6, 2025 · Updated 9 months ago
- AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence ☆10 · Mar 2, 2025 · Updated last year
- ☆23 · Sep 19, 2024 · Updated last year
- This is AlpaGasus2-QLoRA based on LLaMA2 with AlpaGasus mechanism using QLoRA! ☆15 · Nov 22, 2023 · Updated 2 years ago
- This is the official repository for the "Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP" paper acce… ☆25 · Feb 16, 2026 · Updated last month
- An experimental modular OS written in Rust. ☆17 · Feb 11, 2025 · Updated last year
- Very concise example of integrated gradients (a method to reveal areas of attention in input images) ☆10 · Jun 17, 2019 · Updated 6 years ago
- RewardAnything: Generalizable Principle-Following Reward Models ☆45 · Jun 11, 2025 · Updated 9 months ago
- MTEB: Massive Text Embedding Benchmark ☆11 · Jan 29, 2024 · Updated 2 years ago
- [EMNLP Findings 2024 & ACL 2024 NLRSE Oral] Enhancing Mathematical Reasonin… ☆51 · May 4, 2024 · Updated last year
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data" ☆17 · Feb 22, 2024 · Updated 2 years ago
- PICABench: How Far Are We from Physically Realistic Image Editing? ☆36 · Nov 5, 2025 · Updated 4 months ago
- Labels issues using OpenAI's Classification API powered by GPT-3 models! ☆19 · Apr 6, 2023 · Updated 2 years ago
- [EMNLP 2024] Tree of Problems: Improving structured problem solving with compositionality ☆19 · Mar 4, 2025 · Updated last year
- Repository for Skill Set Optimization ☆14 · Jul 26, 2024 · Updated last year
- Demonstration of how to run multiple chains in LangChain asynchronously ☆12 · Jul 6, 2023 · Updated 2 years ago