janphilippfranken / sami
Self-Supervised Alignment with Mutual Information
☆14Updated 5 months ago
Related projects
Alternatives and complementary repositories for sami
- Directional Preference Alignment☆49Updated last month
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards☆39Updated 3 months ago
- Code for LaMPP: Language Models as Probabilistic Priors for Perception and Action☆35Updated last year
- Code for paper "Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning"☆62Updated 9 months ago
- Is In-Context Learning Sufficient for Instruction Following in LLMs?☆23Updated 5 months ago
- ☆26Updated last year
- ☆24Updated 6 months ago
- This repository contains data, code and models for contextual noncompliance.☆18Updated 3 months ago
- ☆18Updated 2 months ago
- ICML 2024 - Official Repository for EXO: Towards Efficient Exact Optimization of Language Model Alignment☆46Updated 4 months ago
- Repository for Skill Set Optimization☆12Updated 3 months ago
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data"☆44Updated 9 months ago
- Codebase for Instruction Following without Instruction Tuning☆29Updated last month
- ☆35Updated 9 months ago
- ☆18Updated 5 months ago
- Evaluate the Quality of Critique☆35Updated 5 months ago
- [ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning☆20Updated 8 months ago
- ☆15Updated 4 months ago
- DiffuGPT and DiffuLLaMA: Scaling Diffusion Language Models via Adaptation from Autoregressive Models☆55Updated 2 weeks ago
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"☆24Updated 6 months ago
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"☆17Updated 8 months ago
- Tasks for describing differences between text distributions.☆16Updated 3 months ago
- ☆15Updated 3 months ago
- ☆23Updated 3 months ago
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems.☆45Updated 4 months ago
- [ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning☆30Updated 3 months ago
- ☆27Updated 8 months ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"☆32Updated 9 months ago
- [EMNLP-2022 Findings] Code for paper “ProGen: Progressive Zero-shot Dataset Generation via In-context Feedback”.☆24Updated last year
- A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643☆69Updated last year