ShuoTang123 / MATRIX
Implementation of the MATRIX framework (ICML 2024)
☆54 · Updated last year
Alternatives and similar repositories for MATRIX
Users who are interested in MATRIX are comparing it to the libraries listed below.
- ☆33 · Updated 8 months ago
- ☆43 · Updated 8 months ago
- PyTorch implementation of Tree Preference Optimization (TPO) (Accepted at ICLR'25) ☆19 · Updated 2 months ago
- ☆139 · Updated last month
- ☆45 · Updated 4 months ago
- [ACL'24] Chain of Thought (CoT) is significant in improving the reasoning abilities of large language models (LLMs). However, the correla… ☆45 · Updated last month
- Code for "CREAM: Consistency Regularized Self-Rewarding Language Models", ICLR 2025 ☆22 · Updated 4 months ago
- Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models ☆14 · Updated last year
- Official repository for "Safety Challenges in Large Reasoning Models: A Survey" - exploring safety risks, attacks, and defenses for Large… ☆44 · Updated 2 weeks ago
- A Survey on the Honesty of Large Language Models ☆57 · Updated 6 months ago
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct ☆177 · Updated 5 months ago
- ☆26 · Updated last year
- Repository for the paper "REEF: Representation Encoding Fingerprints for Large Language Models," which aims to protect the IP of open-source… ☆44 · Updated 5 months ago
- A Framework for LLM-based Multi-Agent Reinforced Training and Inference ☆136 · Updated last week
- ☆46 · Updated 7 months ago
- The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning ☆191 · Updated last week
- Official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning" ☆65 · Updated 2 months ago
- ☆127 · Updated 2 weeks ago
- Official implementation for the EMNLP 2024 (main) paper "AgentReview: Exploring Academic Peer Review with LLM Agent" ☆70 · Updated 7 months ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆73 · Updated 4 months ago
- ☆64 · Updated 3 weeks ago
- [ACL'24, Outstanding Paper] Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! ☆37 · Updated 10 months ago
- Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning" ☆75 · Updated 3 weeks ago
- Accepted LLM papers at NeurIPS 2024 ☆37 · Updated 8 months ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ☆144 · Updated 7 months ago
- Official implementation of "ALI-Agent: Assessing LLMs' Alignment with Human Values via Agent-based Evaluation" ☆18 · Updated last month
- ☆9 · Updated 8 months ago
- Code for the ICML 2024 paper "Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment" ☆72 · Updated 2 weeks ago
- [NeurIPS 2024] Official code for $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$ ☆45 · Updated 8 months ago
- [ICLR 2025] Official codebase for the paper "Multimodal Situational Safety" ☆18 · Updated this week