Systematic evaluation framework that automatically rates overthinking behavior in large language models.
☆100 · May 16, 2025 · Updated 11 months ago
Alternatives and similar repositories for ThinkingAgent
Users who are interested in ThinkingAgent are comparing it to the libraries listed below.
- ☆18 · Apr 11, 2025 · Updated last year
- [ToN 2023 && IWQoS 2021] FlexNF: Flexible Network Function Orchestration for Scalable On-Path Service Chain Serving ☆19 · Jun 8, 2024 · Updated last year
- [NeurIPS 2025] CodeCrash: Exposing LLM Fragility to Misleading Natural Language in Code Reasoning ☆17 · Jan 24, 2026 · Updated 3 months ago
- UQ: Assessing Language Models on Unsolved Questions ☆30 · Aug 26, 2025 · Updated 8 months ago
- ☆63 · May 13, 2025 · Updated 11 months ago
- Elastic computing platform ☆31 · Apr 22, 2026 · Updated last week
- The original Shared Recurrent Memory Transformer implementation ☆35 · Jul 11, 2025 · Updated 9 months ago
- The source code of QueryAttack. ☆27 · Feb 23, 2025 · Updated last year
- Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs ☆102 · Oct 23, 2024 · Updated last year
- ☆96 · Dec 6, 2024 · Updated last year
- Nexusflow function call, tool use, and agent benchmarks. ☆30 · Dec 13, 2024 · Updated last year
- ☆118 · Nov 7, 2024 · Updated last year
- [NAACL'25 🏆 SAC Award] Official code for "Advancing MoE Efficiency: A Collaboration-Constrained Routing (C2R) Strategy for Better Expert… ☆16 · Feb 4, 2025 · Updated last year
- ☆58 · Sep 2, 2024 · Updated last year
- Examples of Verbalized Machine Learning (VML) ☆16 · Mar 16, 2025 · Updated last year
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features" ☆17 · Mar 31, 2025 · Updated last year
- LongAttn: Selecting Long-context Training Data via Token-level Attention ☆15 · Jul 16, 2025 · Updated 9 months ago
- ☆30 · Jan 31, 2026 · Updated 3 months ago
- Alice in Wonderland code base for experiments and raw experiments data ☆131 · Feb 4, 2026 · Updated 2 months ago
- [NeurIPS 2025] GuideFlow3D: Optimization-Guided Rectified Flow For Appearance Transfer ☆26 · Mar 20, 2026 · Updated last month
- ☆105 · Dec 6, 2024 · Updated last year
- FastCuRL: Curriculum Reinforcement Learning with Stage-wise Context Scaling for Efficient LLM Reasoning (EMNLP 2025) ☆58 · Oct 10, 2025 · Updated 6 months ago
- The code implementation for TTCS: Test-Time Curriculum Synthesis for Self-Evolving. ☆45 · Apr 22, 2026 · Updated last week
- Automatic evals for LLMs ☆591 · Feb 24, 2026 · Updated 2 months ago
- [ICLR 2026] PSFT is a trust-region–inspired fine-tuning objective that views SFT as a policy gradient method with constant advantages, co… ☆38 · Sep 9, 2025 · Updated 7 months ago
- Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning" ☆189 · May 20, 2025 · Updated 11 months ago
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024] ☆149 · Oct 27, 2024 · Updated last year
- ☆43 · May 29, 2025 · Updated 11 months ago
- LCA-on-the-line (ICML 2024 Oral) ☆14 · Feb 13, 2025 · Updated last year
- Official Repository for Paper "BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Compet… ☆18 · Aug 28, 2024 · Updated last year
- ☆23 · Dec 17, 2024 · Updated last year
- ☆16 · Feb 22, 2025 · Updated last year
- Mixture of Cognitive Reasoners: Modular Reasoning with Brain-Like Specialization ☆42 · Feb 7, 2026 · Updated 2 months ago
- Multimodal Large Language Models for Code Generation under Multimodal Scenarios ☆236 · Updated this week
- Code and data for the EMNLP 2021 paper "Just Say No: Analyzing the Stance of Neural Dialogue Generation in Offensive Contexts". Coming so… ☆17 · Jul 27, 2023 · Updated 2 years ago
- ☆22 · Oct 22, 2024 · Updated last year
- ☆14 · Apr 14, 2025 · Updated last year
- [ACL 2025 Main] Official Repository for "Evaluating Language Models as Synthetic Data Generators" ☆41 · Dec 13, 2024 · Updated last year
- ☆137 · May 8, 2025 · Updated 11 months ago