☆32Aug 11, 2025Updated 6 months ago
Alternatives and similar repositories for HBPO
Users who are interested in HBPO are comparing it to the repositories listed below
- ☆36Oct 9, 2025Updated 4 months ago
- [NeurIPS 2025] Let LRMs Break Free from Overthinking via Self-Braking Tuning. https://arxiv.org/abs/2505.14604☆55Nov 4, 2025Updated 4 months ago
- Benchmarking agent reasoning capabilities in physical interactions, tool usage, and multi-agent coordination.☆42Aug 10, 2025Updated 6 months ago
- [NeurIPS 2025] Mind the Gap: Bridging Thought Leap for Improved CoT Tuning https://arxiv.org/abs/2505.14684☆45Oct 20, 2025Updated 4 months ago
- GSM8K-V: Can Vision Language Models Solve Grade School Math Word Problems in Visual Contexts☆39Sep 30, 2025Updated 5 months ago
- [AAAI 2026] Test-Time Reinforcement Learning for GUI Grounding via Region Consistency https://arxiv.org/abs/2508.05615☆61Nov 8, 2025Updated 3 months ago
- ☆25Aug 19, 2025Updated 6 months ago
- [ICLR 2026] InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models☆47Feb 12, 2026Updated 3 weeks ago
- This repository is the official implementation of TimeHC-RL (Distilabel (Data Generation) + TRL (SFT) + VeRL (GRPO)).☆48Jun 4, 2025Updated 9 months ago
- [ACM MM 2025] SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation. https://arxiv.org/abs/2506.03139☆75Nov 10, 2025Updated 3 months ago
- An extension of the GaLore paper that performs Natural Gradient Descent in a low-rank subspace☆18Oct 21, 2024Updated last year
- A Unified Framework for High-Performance and Extensible LLM Steering☆179Updated this week
- Control LLM☆22Apr 6, 2025Updated 11 months ago
- ☆30Aug 27, 2024Updated last year
- [NeurIPS 2024] Low rank memory efficient optimizer without SVD☆33Jul 1, 2025Updated 8 months ago
- Code for the 2025 ACL publication "Fine-Tuning on Diverse Reasoning Chains Drives Within-Inference CoT Refinement in LLMs"☆32Jun 25, 2025Updated 8 months ago
- ☆59Nov 17, 2025Updated 3 months ago
- ☆10Sep 7, 2019Updated 6 years ago
- Repository of IPBench☆19Jan 4, 2026Updated 2 months ago
- ☆35Mar 12, 2025Updated 11 months ago
- ☆10Sep 29, 2024Updated last year
- Code for an agentic RAG method with a dynamic workflow.☆12Jan 22, 2026Updated last month
- An Advanced Basic Math Reasoning and Overthinking Evaluation Framework for LLMs☆12Jul 8, 2025Updated 7 months ago
- ☆11Jul 17, 2023Updated 2 years ago
- [NeurIPS 2025] Official code for "Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms"☆23Oct 23, 2025Updated 4 months ago
- A curated collection of resources, tools, and frameworks for developing GUI Agents.☆311Updated this week
- Reinforced Multi-LLM Agents training☆73Jan 18, 2026Updated last month
- EA-HAS-Bench: Energy-Aware Hyperparameter and Architecture Search Benchmark (ICLR Spotlight 2023)☆18Dec 8, 2024Updated last year
- Chinese-to-emoji converter☆11Dec 17, 2018Updated 7 years ago
- ☆11Jun 12, 2024Updated last year
- Code for the paper "FinRLlama: A Solution to LLM-Engineered Signals Challenge at FinRL Contest 2024"☆13Feb 14, 2025Updated last year
- Fast search index for SPLADE sparse retrieval models implemented in Python using Numpy and Numba☆36Oct 16, 2025Updated 4 months ago
- ☆16May 16, 2025Updated 9 months ago
- Transformer + GAT for RNA chemical reactivity prediction | Stanford Ribonanza☆11Jan 28, 2026Updated last month
- An LLM inference engine, written in C++☆18Feb 5, 2026Updated last month
- [NeurIPS 2025@FoRLM] R1-Compress: Long Chain-of-Thought Compression via Chunk Compression and Search☆17Jan 24, 2026Updated last month
- opentqa is an open framework for textbook question answering that includes xtqa, mcan, cmr, mfb, and mutan.☆11Mar 27, 2021Updated 4 years ago
- [ICML 2025 Spotlight] RAPID: Long-Context Inference with Retrieval-Augmented Speculative Decoding☆19Mar 2, 2025Updated last year
- Encoder-decoders for translating different chemical formats.☆18Sep 17, 2025Updated 5 months ago