IBM / larimar
Code for ICML 2024 paper
☆26Updated last month
Alternatives and similar repositories for larimar
Users that are interested in larimar are comparing it to the libraries listed below
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs☆54Updated last year
- ☆53Updated last week
- Reinforcing General Reasoning without Verifiers☆60Updated 2 weeks ago
- ☆96Updated 9 months ago
- ☆65Updated last year
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting☆32Updated last year
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Reasoning with Minimal Examples☆95Updated 2 weeks ago
- Codebase for Instruction Following without Instruction Tuning☆34Updated 9 months ago
- Directional Preference Alignment☆57Updated 9 months ago
- Code for "Reasoning to Learn from Latent Thoughts"☆105Updated 2 months ago
- ☆13Updated 10 months ago
- ☆48Updated last month
- Official GitHub repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024]☆138Updated 9 months ago
- Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging☆107Updated last year
- Reproduction of "RLCD Reinforcement Learning from Contrast Distillation for Language Model Alignment"☆69Updated last year
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model☆43Updated last year
- Official code of the paper Large Language Models Are Implicitly Topic Models: Explaining and Finding Good Demonstrations for In-Context Le…☆74Updated last year
- Long Context Extension and Generalization in LLMs☆57Updated 9 months ago
- [ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning☆45Updated 10 months ago
- Code and data used in the paper: "Training on Incorrect Synthetic Data via RL Scales LLM Math Reasoning Eight-Fold"☆30Updated last year
- Learning adapter weights from task descriptions☆19Updated last year
- Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment"☆73Updated last month
- ☆97Updated 11 months ago
- PyTorch implementation for "Compressed Context Memory For Online Language Model Interaction" (ICLR'24)☆61Updated last year
- A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models☆57Updated 4 months ago
- ☆32Updated 5 months ago
- This is an official implementation of the paper ``Building Math Agents with Multi-Turn Iterative Preference Learning'' with multi-turn DP…☆25Updated 6 months ago
- Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?"☆88Updated last month
- ☆40Updated 2 weeks ago
- Code for Paper: Learning Adaptive Parallel Reasoning with Language Models☆107Updated 2 months ago