☆32Aug 11, 2025Updated 8 months ago
Alternatives and similar repositories for HBPO
Users that are interested in HBPO are comparing it to the libraries listed below.
- ☆37Oct 9, 2025Updated 6 months ago
- [NeurIPS 2025] Let LRMs Break Free from Overthinking via Self-Braking Tuning. https://arxiv.org/abs/2505.14604 ☆55Nov 4, 2025Updated 5 months ago
- [NeurIPS 2025] Mind the Gap: Bridging Thought Leap for Improved CoT Tuning https://arxiv.org/abs/2505.14684 ☆47Oct 20, 2025Updated 5 months ago
- [AAAI 2026] Test-Time Reinforcement Learning for GUI Grounding via Region Consistency https://arxiv.org/abs/2508.05615 ☆63Nov 8, 2025Updated 5 months ago
- GSM8K-V: Can Vision Language Models Solve Grade School Math Word Problems in Visual Contexts☆40Sep 30, 2025Updated 6 months ago
- ViewSpatial-Bench: Evaluating Multi-perspective Spatial Localization in Vision-Language Models☆72Mar 9, 2026Updated last month
- Official code for "KnowU-Bench: Towards Interactive, Proactive, and Personalized Mobile Agent Evaluation"☆51Updated this week
- [ICLR 2026] InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models☆51Feb 12, 2026Updated 2 months ago
- This repository is the official implementation of TimeHC-RL (Distilabel (Data Generation) + TRL (SFT) + VeRL (GRPO)).☆48Jun 4, 2025Updated 10 months ago
- [ACM MM 2025] SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation. https://arxiv.org/abs/2506.03139 ☆78Nov 10, 2025Updated 5 months ago
- [AAAI 2026] GUI-G²: Gaussian Reward Modeling for GUI Grounding☆305Feb 2, 2026Updated 2 months ago
- A curated collection of resources, tools, and frameworks for developing GUI Agents.☆394Apr 8, 2026Updated last week
- The official implementation of dLLM-Factory☆25Jul 12, 2025Updated 9 months ago
- Fine-tune large language models with the DPO algorithm; simple and easy to get started.☆51Jul 3, 2024Updated last year
- Official code for "SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization"☆182Apr 7, 2026Updated last week
- An extension to the GaLore paper to perform Natural Gradient Descent in a low-rank subspace☆18Oct 21, 2024Updated last year
- A deep research framework☆27Feb 3, 2026Updated 2 months ago
- code for GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation☆18Dec 7, 2024Updated last year
- Control LLM☆22Apr 6, 2025Updated last year
- Harbin Institute of Technology "Database Systems" course labs, Spring 2018☆11Jun 10, 2018Updated 7 years ago
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts"☆19Mar 10, 2025Updated last year
- Code of "Regularized Best-of-N Sampling with Minimum Bayes Risk Objective for Language Model Alignment" (2025).☆14Apr 4, 2025Updated last year
- Happily_Do_USTB: university physics lab experiments☆23Aug 3, 2024Updated last year
- A client-only OpenAI LLM Playground for prototyping agents without writing any code.☆22Aug 31, 2023Updated 2 years ago
- [CVPR 2026] Variation-aware Vision Token Dropping for Faster Large Vision-Language Models☆29Mar 18, 2026Updated 3 weeks ago
- MCM/ICM crawler: scrapes certificates from the US Mathematical Contest in Modeling and analyzes their information with OCR☆16Jun 25, 2022Updated 3 years ago
- Awesome-Parallel-Reasoning: Unlocking the reasoning potential of LLMs. Papers, Code, Resources & Survey.☆51Mar 8, 2026Updated last month
- ViLoMem: Agentic Learner with Grow-and-Refine Multimodal Semantic Memory☆59Nov 27, 2025Updated 4 months ago
- ☆31Aug 27, 2024Updated last year
- Benchmarking Autonomous Mobile Agents in Agent-User Interactive and MCP-Augmented Environments☆176Updated this week
- MDPO: Overcoming the Training-Inference Divide of Masked Diffusion Language Models☆42Jan 28, 2026Updated 2 months ago
- When Reasoning Meets Its Laws☆36Jan 2, 2026Updated 3 months ago
- Official code implementation for the ACL 2025 paper: 'Dynamic Scaling of Unit Tests for Code Reward Modeling'☆27May 16, 2025Updated 10 months ago
- [ICLR/AAAI 2026] Open-Source LLM-Based Data Analysis Agents☆79Jan 26, 2026Updated 2 months ago
- Computer vision course project: an image-text retrieval system based on Chinese-CLIP☆103Jun 20, 2023Updated 2 years ago
- TransE in PyTorch☆18Jul 31, 2019Updated 6 years ago
- Ring-V2 is a reasoning MoE LLM provided and open-sourced by InclusionAI.☆97Oct 23, 2025Updated 5 months ago
- diffusion models tutorials☆15Aug 19, 2025Updated 7 months ago
- ☆44Mar 31, 2026Updated 2 weeks ago