sylvain-wei / TIME
[NeurIPS 2025 D&B (Spotlight)] TIME: A Multi-level Benchmark for Temporal Reasoning of LLMs in Real-World Scenario
★29 · Updated 3 months ago
Alternatives and similar repositories for TIME
Users who are interested in TIME are comparing it to the repositories listed below.
- This is the repository of DEER, a Dynamic Early Exit in Reasoning method for Large Reasoning Language Models. ★179 · Updated 6 months ago
- Official repository for paper: O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning ★97 · Updated 11 months ago
- The official implementation of the paper "Mem-α: Learning Memory Construction via Reinforcement Learning" ★149 · Updated last month
- Resources and paper list for 'Scaling Environments for Agents'. This repository accompanies our survey on how environments contribute to … ★57 · Updated this week
- ★16 · Updated 3 months ago
- Chain of Thoughts (CoT) is so hot! so long! We need short reasoning process! ★72 · Updated 9 months ago
- ★62 · Updated 6 months ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ★89 · Updated 11 months ago
- [ICLR 2025] Code and Data Repo for Paper "Latent Space Chain-of-Embedding Enables Output-free LLM Self-Evaluation" ★93 · Updated last year
- ★177 · Updated last month
- [NeurIPS'25 Spotlight] ARM: Adaptive Reasoning Model ★64 · Updated 3 months ago
- [NeurIPS 2025@FoRLM] R1-Compress: Long Chain-of-Thought Compression via Chunk Compression and Search ★17 · Updated last week
- [ICML'25] Our study systematically investigates massive values in LLMs' attention mechanisms. First, we observe massive values are concen… ★85 · Updated 7 months ago
- [COLM'25] Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill? ★36 · Updated 7 months ago
- MemGen: Weaving Generative Latent Memory for Self-Evolving Agents ★290 · Updated 2 months ago
- Code, benchmark and environment for "ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows" ★120 · Updated 2 months ago
- Official repository for ToolScope: An Agentic Framework for Vision-Guided and Long-Horizon Tool Use ★27 · Updated 2 months ago
- [ACL'25] The official code repository for PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models. ★87 · Updated 11 months ago
- ★43 · Updated last month
- ★20 · Updated 3 months ago
- This is a repository for organizing papers, codes, and other resources related to Latent Reasoning. ★348 · Updated 2 months ago
- [FSE'2026] SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks ★138 · Updated this week
- A curated list of awesome resources focusing on Context Compression techniques for Large Language Models (LLMs). ★53 · Updated 2 weeks ago
- Official Implementation of "ToolSafe: Enhancing Tool Invocation Safety of LLM-based Agents via Proactive Step-level Guardrail and Feedbac… ★26 · Updated last week
- 🔧Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning ★312 · Updated 3 weeks ago
- A Unified Framework for High-Performance and Extensible LLM Steering ★158 · Updated this week
- Official Repository of LatentSeek ★76 · Updated 7 months ago
- [EMNLP 2025] LightThinker: Thinking Step-by-Step Compression ★131 · Updated 9 months ago
- Paper list on decoding methods for LLMs and LVLMs ★68 · Updated 2 months ago
- [NeurIPS 2025] The implementation of paper "On Reasoning Strength Planning in Large Reasoning Models" ★29 · Updated 6 months ago