A curated collection of papers, technical reports, frameworks, and tools for on-policy distillation (OPD) of large language models
☆276Jun 8, 2026Updated this week
Alternatives and similar repositories for awesome-on-policy-distillation
Users that are interested in awesome-on-policy-distillation are comparing it to the libraries listed below.
- Aligntune: A Modular Toolkit for Post-Training Alignment of LLMs☆36Jun 2, 2026Updated last week
- Code for the paper "Overconfidence is a Dangerous Thing: Mitigating Membership Inference Attacks by Enforcing Less Confident Prediction" …☆13Sep 6, 2023Updated 2 years ago
- ☆13Mar 15, 2022Updated 4 years ago
- [CVPR'26] VisPlay: Self-Evolving Vision-Language Models☆57Feb 25, 2026Updated 3 months ago
- ☆20Jan 21, 2025Updated last year
- Official implementation of MATPO: Multi-Agent Tool-Integrated Policy Optimization.☆81Oct 31, 2025Updated 7 months ago
- Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models"☆40Nov 11, 2024Updated last year
- Official Codebase for "Generative Multimodal Model Features Are Discriminative Vision-Language Classifiers"☆27Jun 7, 2025Updated last year
- Code for Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation (EVOL-RL).☆51Mar 31, 2026Updated 2 months ago
- Official Repo for FoodieQA paper (EMNLP 2024)☆20Jun 26, 2025Updated 11 months ago
- PDF Extraction Toolkit (wraps and trains LayoutLM)☆10Oct 8, 2021Updated 4 years ago
- This is the official implementation of the paper titled "Comprehensive Comparison of Vision Transformers and Traditional Convolutional Ne…☆14Mar 4, 2025Updated last year
- Auto1111 port of NVlab's adversarial purification method that uses the forward and reverse processes of diffusion models to remove advers…☆13Aug 8, 2023Updated 2 years ago
- Base repo for paper 'StyleMeUp: Towards Style-Agnostic Sketch-Based Image Retrieval'☆15Apr 27, 2022Updated 4 years ago
- meta-comprehensive-rag-benchmark-kdd-cup-2024 phase1 task1 rank3☆21Jun 21, 2024Updated last year
- An End-to-End Model with Adaptive Filtering for Retrieval-Augmented Generation☆16Oct 27, 2024Updated last year
- Stock price prediction using an AdaBoost-based SVM☆11Mar 4, 2018Updated 8 years ago
- FeatureAlignment = Alignment + Mechanistic Interpretability☆35Mar 8, 2025Updated last year
- Code for paper "Omni-SafetyBench: A Benchmark for Safety Evaluation of Audio-Visual Large Language Models".☆114Sep 28, 2025Updated 8 months ago
- ViCToR: Improving Visual Comprehension via Token Reconstruction for Pretraining LMMs☆29Aug 15, 2025Updated 9 months ago
- ICML2024-ReconBoost: Boosting Can Achieve Modality Reconcilement☆29May 2, 2025Updated last year
- ☆61Feb 9, 2026Updated 4 months ago
- The official repo for LIFT: Language-Image Alignment with Fixed Text Encoders☆42Jun 10, 2025Updated last year
- DCEN☆13Aug 12, 2021Updated 4 years ago
- 🔍 Awesome Agentic Search is a curated list of papers, tools, and resources on agentic search—where AI agents plan, search, and reason to…☆57Aug 28, 2025Updated 9 months ago
- [NeurIPS 2025] First SFT, Second RL, Third UPT: Continual Improving Multi-Modal LLM Reasoning via Unsupervised Post-Training☆87Oct 29, 2025Updated 7 months ago
- Tianchi COVID-19 similar sentence pair matching competition, 大白_Rank6☆13Apr 8, 2020Updated 6 years ago
- ☆33Jan 8, 2026Updated 5 months ago
- Implementation and Benchmark Splits to study Out-of-Distribution Generalization in Deep Metric Learning.☆25Oct 2, 2021Updated 4 years ago
- ☆18May 28, 2021Updated 5 years ago
- Code of "A Semantic-based Method for Unsupervised Commonsense Question Answering"☆14Jul 29, 2021Updated 4 years ago
- TensorRT-in-Action is a GitHub repository that provides code examples for using TensorRT, with corresponding Jupyter Notebooks.☆15Jun 1, 2023Updated 3 years ago
- Chit-chat bot☆11Aug 12, 2020Updated 5 years ago
- ☆13Jul 21, 2025Updated 10 months ago
- [AAAI'25] CharacterBench: Benchmarking Character Customization of Large Language Models☆23Aug 1, 2025Updated 10 months ago
- Procedural city generation.☆13Oct 15, 2022Updated 3 years ago
- A free tool that helps you transcribe, translate, and summarize videos in any language.☆18Feb 27, 2024Updated 2 years ago
- Official MATPAC implementation and trained model's weights☆35Jun 2, 2026Updated last week
- 100% local AI coding agent: create files, edit code, and run terminal commands offline in Antigravity/VS Code/Cursor. Based on Ollama.☆145Updated this week