Official repository for the paper "Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation"
☆246May 28, 2026Updated last month
Alternatives and similar repositories for G-OPD
Users that are interested in G-OPD are comparing it to the libraries listed below.
- A Comprehensive Benchmark of Imbalanced Graph Learning (Accepted by ICLR 2025 Spotlight)☆15Apr 17, 2025Updated last year
- ☆40May 9, 2026Updated last month
- On Policy Distillation Built on top of Verl☆87May 25, 2026Updated last month
- VLS: Steering Pretrained Robot Policies via Vision–Language Models☆63Mar 29, 2026Updated 3 months ago
- Code for the paper 'Neural Variational Gradient Descent'. We perform nonparametric variational inference by transporting samples along a …☆12Jul 29, 2021Updated 4 years ago
- [ICML 2025] Closed-Loop Long-Horizon Robotic Planning via Equilibrium Sequence Modeling☆13May 5, 2025Updated last year
- Code for AAAI'25 paper: LLM-Powered User Simulator for Recommender System☆28Jan 6, 2025Updated last year
- The repository for ACL 2024 paper "TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models"☆35Jun 29, 2024Updated 2 years ago
- [ICML 2026] BizFinBench.v2: A Unified Offline–Online Bilingual Benchmark for Expert-Level Financial Capability Evaluation of LLMs☆44May 1, 2026Updated last month
- (ACM MM24) This is the official repository of GIST: Improving Parameter Efficient Fine Tuning via Knowledge Interaction.☆11Jan 28, 2024Updated 2 years ago
- Stein Variational Gradient Descent with Matrix-Valued Kernels☆13Dec 4, 2019Updated 6 years ago
- Entropy-Driven GRPO with Guided Error Correction for Advantage Diversity☆22Aug 28, 2025Updated 10 months ago
- Cross-Self KV Cache Pruning for Efficient Vision-Language Inference☆10Dec 15, 2024Updated last year
- ☆21Apr 3, 2026Updated 2 months ago
- Official repository for ICLR 2025 paper "Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs"☆20Mar 18, 2025Updated last year
- ☆14Apr 30, 2025Updated last year
- CAD - Memory Efficient Convolutional Adapter for Segment Anything☆12Oct 4, 2024Updated last year
- An unofficial implementation of SOLAR-10.7B model and the newly proposed interlocked-DUS(iDUS) implementation and experiment details.☆14Mar 20, 2024Updated 2 years ago
- ☆14May 9, 2024Updated 2 years ago
- (ICME24) This is the official repository of iDAT: inverse Distillation Adapter-Tuning.☆13Apr 3, 2024Updated 2 years ago
- Code to reproduce results of our experiments using LoRe☆18Jun 10, 2026Updated 2 weeks ago
- Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge☆131May 24, 2026Updated last month
- [NeurIPS D&B Track 2024] Source code for the paper "Constrained Human-AI Cooperation: An Inclusive Embodied Social Intelligence Challenge…☆25May 2, 2025Updated last year
- ☆16Apr 11, 2022Updated 4 years ago
- PyTorch implementation of various distillation approaches for continual learning of Diffusion Models.☆26Mar 4, 2025Updated last year
- ☆12Feb 12, 2024Updated 2 years ago
- Example use cases for the GPT-4 Vision API☆19Nov 26, 2023Updated 2 years ago
- DoctorAgent-RL: A Multi-Agent Collaborative Reinforcement Learning System for Multi-Turn Clinical Dialogue☆90Jan 23, 2026Updated 5 months ago
- [ICCV 2025] Official repo of "EC-Flow: Enabling Versatile Robotic Manipulation from Action-Unlabeled Videos via Embodiment-Centric Flow"☆27Oct 16, 2025Updated 8 months ago
- Reasoning Activation in LLMs via Small Model Transfer (NeurIPS 2025)☆22Oct 16, 2025Updated 8 months ago
- [ICML 2024] Official repository of ICML 2024 - RoboMP2: A Robotic Multimodal Perception-Planning Framework with Multimodal Large Language…☆11Apr 4, 2026Updated 2 months ago
- Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts☆26Feb 23, 2024Updated 2 years ago
- ☆28Sep 15, 2025Updated 9 months ago
- Original implementation of SmartRAG: Jointly Learn RAG-Related Tasks From the Environment Feedback (ICLR 2025)☆18Feb 17, 2025Updated last year
- code for the paper "CoReS: Orchestrating the Dance of Reasoning and Segmentation"☆23Nov 24, 2025Updated 7 months ago
- Solving High Frequency and Multi-Scale PDEs with Gaussian Processes (ICLR 2024)☆26Jun 7, 2024Updated 2 years ago
- ☆17Jan 31, 2024Updated 2 years ago
- [IROS 2025] ReBot: Scaling Robot Learning with Real-to-Sim-to-Real Robotic Video Synthesis☆26May 17, 2025Updated last year
- Official implementation for “HarmonyGuard: Toward Safety and Utility in Web Agents via Adaptive Policy Enhancement and Dual-Objective Opt…☆29Jan 10, 2026Updated 5 months ago