On-Policy Distillation Built on Top of Verl
☆69May 25, 2026Updated last week
Alternatives and similar repositories for OPSD_OnPolicyDistillation
Users who are interested in OPSD_OnPolicyDistillation are comparing it to the libraries listed below.
- Official repo for "TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders"☆25Apr 9, 2026Updated last month
- ☆17Apr 11, 2025Updated last year
- A Recipe for Building LLM Reasoners to Solve Complex Instructions☆32Oct 9, 2025Updated 7 months ago
- [ACL 2026] From Experience to Skill: Multi-Agent Generative Engine Optimization via Reusable Strategy Learning☆38Apr 26, 2026Updated last month
- We introduce Reasoning via Video, a new paradigm that uses maze-solving video generation to probe multimodal reasoning; our VR-Bench show…☆61Feb 4, 2026Updated 4 months ago
- Py Minutiae Viewer is a cross-platform, multi-format minutiae viewer.☆12Dec 5, 2019Updated 6 years ago
- Python numerical optimization toolbox☆12Nov 20, 2018Updated 7 years ago
- Builds mainly on the practice code from Dr. Gao's book 《视觉SLAM十四讲》 (14 Lectures on Visual SLAM), with some additional code I use regularly.☆16Mar 11, 2021Updated 5 years ago
- [ICML2026] ARLArena☆78May 2, 2026Updated last month
- ☆32Aug 11, 2025Updated 9 months ago
- ☆30Jul 22, 2024Updated last year
- TBD☆57Mar 13, 2026Updated 2 months ago
- Vero: An Open RL Recipe for General Visual Reasoning☆122Updated this week
- Implementation of the pictorial structures algorithm☆14Jul 23, 2014Updated 11 years ago
- Official Implementation of "Semantics-Consistent Feature Search for Self-Supervised Visual Representation Learning" in AAAI2024.☆13Feb 28, 2024Updated 2 years ago
- FakePartsBench: 25K+ AI-generated videos with pixel- and frame-level annotations of full and partial deepfakes.☆25May 29, 2026Updated last week
- ICML 2025 Spotlight, PCEvolve: Private Contrastive Evolution for Synthetic Dataset Generation via Few-Shot Private Data and Generative AP…☆14Jun 27, 2025Updated 11 months ago
- [AAAI 2026] Test-Time Reinforcement Learning for GUI Grounding via Region Consistency https://arxiv.org/abs/2508.05615☆65Nov 8, 2025Updated 7 months ago
- PyTorch Implementation for InMaP☆12Oct 28, 2023Updated 2 years ago
- ☆17Jun 10, 2025Updated 11 months ago
- [Arxiv 2025] Official code and datasets of paper: GNNs as Predictors of Agentic Workflow Performances☆20Jan 15, 2026Updated 4 months ago
- [ICML 2026] RLAnything & DemyAgent: General and scalable agentic RL algorithms across terminal, GUI, SWE, and tool-call settings☆535May 16, 2026Updated 3 weeks ago
- Code for "Contrast then Memorize: Semantic Neighbor Retrieval-Enhanced Inductive Multimodal Knowledge Graph Completion", SIGIR 2024.☆14Feb 20, 2025Updated last year
- [NeurIPS 2024 Oral] "Bayesian-Guided Label Mapping for Visual Reprogramming"☆12Dec 20, 2024Updated last year
- Code of "Instruction Multi-Constraint Molecular Generation Using a Teacher-Student Large Language Model"☆13Jul 8, 2025Updated 11 months ago
- collab-dev - Collaboration Metrics for Code Reviews☆23May 12, 2025Updated last year
- Official repository for the paper "Beyond Static Tools: Test-Time Tool Evolution for Scientific Reasoning" and the SciEvo benchmark.☆43Jan 13, 2026Updated 4 months ago
- LLM Reasoning Benchmark & Chain-of-Thoughts Dataset for Chemistry☆53Oct 9, 2025Updated 7 months ago
- [AAAI 2023] Pytorch Implementation for AAAI2023 paper: One-for-All: Proposal Masked Cross-Class Anomaly Detection☆15Oct 31, 2024Updated last year
- [NeurIPS 2025] Let LRMs Break Free from Overthinking via Self-Braking Tuning. https://arxiv.org/abs/2505.14604☆54Nov 4, 2025Updated 7 months ago
- [NeurIPS 2025] Mind the Gap: Bridging Thought Leap for Improved CoT Tuning https://arxiv.org/abs/2505.14684☆48Oct 20, 2025Updated 7 months ago
- Implementation of related angular-margin-based classification loss functions for training (face) embedding models: SphereFace, CosFace, A…☆26May 21, 2024Updated 2 years ago
- Official repository for the paper "Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation"☆208May 28, 2026Updated last week
- Custom ComfyUI node that combines VSR + VFI and allows streaming processing for arbitrary video length.☆66Mar 28, 2026Updated 2 months ago
- Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge☆121May 24, 2026Updated 2 weeks ago
- [ICLR 2026] The official repository for paper "ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning"☆186May 1, 2026Updated last month
- Use MCL in Mirai Console to manage packages and other advanced features☆10Nov 13, 2022Updated 3 years ago
- [arXiv 2026] Official PyTorch Repository for "Coarse-Guided Visual Generation via Weighted h-Transform Sampling"☆42May 8, 2026Updated last month
- ☆15Nov 20, 2023Updated 2 years ago