microsoft / acon
Official implementation of paper "ACON: Optimizing Context Compression for Long-horizon LLM Agents"
☆40Updated 2 months ago
Alternatives and similar repositories for acon
Users that are interested in acon are comparing it to the libraries listed below
- Code for Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation (EVOL-RL).☆41Updated 2 months ago
- Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"☆55Updated last week
- ☆23Updated last year
- Official Code Repository for the paper "Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-intensive Tasks…☆42Updated last year
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."☆50Updated last year
- In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation (ICML 2024)☆62Updated last year
- ☆17Updated 5 months ago
- [NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models☆65Updated last year
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization☆43Updated 10 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction☆86Updated 9 months ago
- A comprehensive collection of learning from rewards in the post-training and test-time scaling of LLMs, with a focus on both reward model…☆60Updated 7 months ago
- The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"☆39Updated 2 years ago
- A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large …☆98Updated last year
- ☆22Updated 2 months ago
- [EMNLP 2024] Official implementation of "Hierarchical Deconstruction of LLM Reasoning: A Graph-Based Framework for Analyzing Knowledge Ut…☆23Updated last year
- A curated list of resources on Reinforcement Learning with Verifiable Rewards (RLVR) and the reasoning capability boundary of Large Langu…☆86Updated last month
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025)☆35Updated 10 months ago
- [arxiv: 2505.02156] Adaptive Thinking via Mode Policy Optimization for Social Language Agents☆46Updated 6 months ago
- JudgeLRM: Large Reasoning Models as a Judge☆40Updated last month
- [ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style☆73Updated 5 months ago
- ☆24Updated 9 months ago
- [ICML 2025] Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment (https://arxiv.org/abs/2410.02197)☆38Updated 4 months ago
- [NeurIPS 2024] A comprehensive benchmark for evaluating critique ability of LLMs☆48Updated last year
- Resources and paper list for 'Scaling Environments for Agents'. This repository accompanies our survey on how environments contribute to …☆53Updated 3 weeks ago
- RM-R1: Unleashing the Reasoning Potential of Reward Models☆156Updated 6 months ago
- [2025-TMLR] A Survey on the Honesty of Large Language Models☆63Updated last year
- PyTorch implementation of StableMask (ICML'24)☆15Updated last year
- A Sober Look at Language Model Reasoning☆92Updated last month
- [EMNLP 2025] LightThinker: Thinking Step-by-Step Compression☆127Updated 9 months ago
- ☆41Updated 4 months ago