Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe
☆441May 12, 2026Updated 2 weeks ago
Alternatives and similar repositories for OPD
Users who are interested in OPD are comparing it to the libraries listed below.
- Summary of courses taken during undergraduate studies at ShanghaiTech University, master's studies at Tsinghua University☆52Feb 14, 2026Updated 3 months ago
- [ICLR 2026] The official repo of "MMTok: Multimodal Coverage Maximization for Efficient Inference of VLMs"☆38Mar 11, 2026Updated 2 months ago
- ☆142Apr 14, 2026Updated last month
- The official implementation of "Depth-Breadth Synergy in RLVR: Unlocking LLM Reasoning Gains with Adaptive Exploration" (ICML 2026)☆24Feb 4, 2026Updated 3 months ago
- This is the official repo for the paper "AMO-Bench: Large Language Models Still Struggle in High School Math Competitions".☆126Feb 6, 2026Updated 3 months ago
- [EMNLP 2024 Findings] Wrong-of-Thought: An Integrated Reasoning Framework with Multi-Perspective Verification and Wrong Information☆13Oct 1, 2024Updated last year
- Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model☆13Dec 29, 2024Updated last year
- A user-friendly & efficient knowledge distillation framework for LLMs, supporting off-policy, on-policy (OPD), cross-tokenizer, multimoda…☆146Updated this week
- [RSS 2025] PartInstruct: Part-level Instruction Following for Fine-grained Robot Manipulation☆19Mar 4, 2026Updated 2 months ago
- ☆257May 10, 2026Updated 2 weeks ago
- A Holistic Embodied Cognition Benchmark☆19Apr 3, 2025Updated last year
- Survey of small language models☆18Jul 23, 2024Updated last year
- On Path to Multimodal Generalist: General-Level and General-Bench☆18Jul 11, 2025Updated 10 months ago
- ☆34Sep 19, 2025Updated 8 months ago
- EgoToM is an egocentric theory-of-mind benchmark built on Ego4D videos, containing multi-choice questions that evaluate multimodal large …☆15Apr 1, 2025Updated last year
- ☆22May 19, 2025Updated last year
- A PyTorch-Lightning based deep learning framework.☆11Apr 15, 2026Updated last month
- Official implementation of [CVPR 2025] RePerformer: Immersive Human-centric Volumetric Videos from Playback to Photoreal Reperformance☆25Sep 9, 2025Updated 8 months ago
- Official Repository of LatentSeek☆83Jun 6, 2025Updated 11 months ago
- [Paper][EMNLP 2025] Enrich-on-Graph: Query-Graph Alignment for Complex Reasoning with LLM Enriching☆34Feb 8, 2026Updated 3 months ago
- Code for EMNLP2023 paper "MolCA: Molecular Graph-Language Modeling with Cross-Modal Projector and Uni-Modal Adapter".☆12Dec 27, 2023Updated 2 years ago
- DiTAS: Quantizing Diffusion Transformers via Enhanced Activation Smoothing (WACV 2025)☆13Feb 7, 2026Updated 3 months ago
- ☆33May 27, 2025Updated last year
- ShanghaiTech SI140A Probability & Statistics for EECS, Spring 2023, Spring 2024.☆24May 1, 2026Updated 3 weeks ago
- [EMNLP 2025 Findings] Familiarity-aware Evidence Compression for Retrieval Augmented Generation☆15Aug 20, 2025Updated 9 months ago
- Spatial Aptitude Training for Multimodal Language Models☆32Feb 8, 2026Updated 3 months ago
- Dataset Quantization with Active Learning based Adaptive Sampling [ECCV 2024]☆10Jul 9, 2024Updated last year
- [ICLR26] Beyond Real: Imaginary Extension of Rotary Position Embeddings for Long-Context LLMs☆33Dec 9, 2025Updated 5 months ago
- ☆26Nov 20, 2025Updated 6 months ago
- daVinci-Agency: Unlocking Long-Horizon Agency Data-Efficiently☆39Feb 4, 2026Updated 3 months ago
- This is the official code of the paper "SupFusion: Supervised LiDAR-Camera Fusion for 3D Object Detection"☆18Aug 23, 2023Updated 2 years ago
- Non-Autoregressive Math Word Problem Solver with Unified Tree Structure☆12Jan 13, 2024Updated 2 years ago
- Code for the Click-Through Rate Prediction Kaggle challenge from Avazu☆11Feb 5, 2017Updated 9 years ago
- 🚀 First survey on Attention Sink in Transformers — 180+ papers on utilization, interpretation, and mitigation.☆79Apr 16, 2026Updated last month
- 🕵 Code for our EMNLP 2025 Main paper: "FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games"☆25Apr 26, 2026Updated last month
- The official repo for "CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models"☆33Mar 26, 2026Updated 2 months ago
- ☆18May 31, 2025Updated 11 months ago
- Official repository for "CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation"☆90Dec 15, 2025Updated 5 months ago
- Look Back to Reason Forward: Revisitable Memory for Long-Context LLM Agents☆40Apr 13, 2026Updated last month