bigai-ai / ICE
【ICLR 2025 🔥】The code for Consistent In-Context Editing, an approach for tuning language models through contextual distributions, overcoming the limitations of traditional fine-tuning methods that learn towards one-hot targets.
☆44 Updated 3 months ago
Alternatives and similar repositories for ICE
Users that are interested in ICE are comparing it to the libraries listed below
- ☆147 Updated 2 months ago
- ☆46 Updated 3 months ago
- [NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs. ☆124 Updated 3 months ago
- Paper List of Inference/Test Time Scaling/Computing ☆280 Updated 2 weeks ago
- [2025-TMLR] A Survey on the Honesty of Large Language Models ☆58 Updated 7 months ago
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning ☆61 Updated this week
- ACL'2025: SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs. and preprint: SoftCoT++: Test-Time Scaling with Soft Chain-of… ☆31 Updated last month
- [ACL 2025] A Neural-Symbolic Self-Training Framework ☆109 Updated last month
- ☆242 Updated last week
- Extrapolating RLVR to General Domains without Verifiers ☆112 Updated 2 weeks ago
- RM-R1: Unleashing the Reasoning Potential of Reward Models ☆113 Updated 3 weeks ago
- The Entropy Mechanism of Reinforcement Learning for Large Language Model Reasoning. ☆251 Updated this week
- ☆318 Updated last month
- Official Repository of LatentSeek ☆51 Updated last month
- ☆113 Updated 4 months ago
- This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space. ☆134 Updated this week
- A comprehensive collection of process reward models. ☆95 Updated 3 weeks ago
- Laser: Learn to Reason Efficiently with Adaptive Length-based Reward Shaping ☆49 Updated last month
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct ☆180 Updated 6 months ago
- One-shot Entropy Minimization ☆167 Updated last month
- Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning" ☆77 Updated last week
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆76 Updated 5 months ago
- Large Language Models Can Self-Improve in Long-context Reasoning ☆71 Updated 7 months ago
- [ACL' 25] The official code repository for PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models. ☆74 Updated 5 months ago
- NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation ☆78 Updated last month
- Github repository for "Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging" (ICML 2025) ☆63 Updated last month
- [ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style ☆56 Updated this week
- [ICML 2024] Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibrati… ☆40 Updated last year
- This is the repo for our paper "Mr-Ben: A Comprehensive Meta-Reasoning Benchmark for Large Language Models" ☆50 Updated 8 months ago
- [NeurIPS 2024] Code for the paper "Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models" ☆169 Updated 4 months ago