PyTorch implementation of "Multi-Level Optimal Transport for Universal Cross-Tokenizer Knowledge Distillation on Language Models", AAAI 2025
☆38Feb 4, 2026Updated 2 months ago
Alternatives and similar repositories for Multi-Level-OT
Users who are interested in Multi-Level-OT are comparing it to the libraries listed below.
- Repo for the EMNLP'24 Paper "Dual-Space Knowledge Distillation for Large Language Models". A general white-box KD framework for both same…☆62Mar 21, 2026Updated 3 weeks ago
- ☆32Mar 13, 2024Updated 2 years ago
- ☆21Jul 9, 2025Updated 9 months ago
- PANDA: Prompt Transfer Meets Knowledge Distillation for Efficient Model Adaptation☆16Mar 28, 2023Updated 3 years ago
- LLMOA: A novel large language model assisted hyper-heuristic optimization algorithm☆16Mar 13, 2025Updated last year
- This project has been merged into the official Chiplab☆13Jan 13, 2025Updated last year
- Fine-tune GPT2 to generate fake job experiences☆11Jan 17, 2023Updated 3 years ago
- ☆11Feb 3, 2025Updated last year
- Implementation of DeepMind's "Sobolev Training for Neural Networks"☆11Apr 2, 2018Updated 8 years ago
- ☆35Aug 18, 2025Updated 7 months ago
- A small demo for training a CNN with PyTorch.☆11Dec 15, 2018Updated 7 years ago
- Mitigating Lost-in-Retrieval Problems in Retrieval Augmented Multi-Hop Question Answering, ACL 2025☆21Oct 28, 2025Updated 5 months ago
- [ACL 2026 (Main)] LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification☆79Jul 14, 2025Updated 8 months ago
- [ICLR 2025] Official repository for the paper "Influence-Guided Diffusion for Dataset Distillation".☆15Feb 12, 2025Updated last year
- Free chrome extension to summarize articles on the web using ChatGPT AI☆18Jan 7, 2023Updated 3 years ago
- Less is More: Task-aware Layer-wise Distillation for Language Model Compression (ICML2023)☆40Aug 28, 2023Updated 2 years ago
- Uses C-GAN for feature hallucination of missing modalities for hyperspectral data. TensorFlow implementation of ICCV '19 paper☆11Sep 9, 2020Updated 5 years ago
- LLMs Learn Task Heuristics from Demonstrations: A Heuristic-Driven Prompting Strategy for Document-Level Event Argument Extraction (ACL 2…☆14Aug 12, 2024Updated last year
- Chinese named entity recognition with Bert-BiLstm-CRF; the dataset comes from https://aistudio.baidu.com/aistudio/competition/detail/802/0/datasets ☆18Mar 1, 2024Updated 2 years ago
- [EMNLP'25 main] This is the official repo for the paper, Can LLMs be Good Graph Judge for Knowledge Graph Construction?☆27Sep 23, 2025Updated 6 months ago
- Code for the AACL 2022 Paper "This Patient Looks Like That Patient: Prototypical Networks for Interpretable Diagnosis Prediction from Cli…☆12Nov 18, 2022Updated 3 years ago
- Implementation of several knowledge distillation techniques on PyTorch☆15Feb 25, 2019Updated 7 years ago
- ResNet-50 for TsinghuaDog classification☆10Feb 2, 2021Updated 5 years ago
- ☆16Jun 8, 2023Updated 2 years ago
- Role-Wise Data Augmentation for Knowledge Distillation☆19Nov 22, 2022Updated 3 years ago
- Unofficial guide for ysyx students applying to ShanghaiTech University☆23Mar 31, 2026Updated last week
- A general-purpose coding agent that runs inside an NVIDIA OpenShell sandbox, orchestrated by Deep Agents and powered by NVIDIA Nemotron. …☆111Updated this week
- [AAAI 2023] IterDE: An Iterative Knowledge Distillation Framework for Knowledge Graph Embeddings☆10Apr 3, 2024Updated 2 years ago
- Zero-Shot Knowledge Distillation in Deep Networks☆67Apr 16, 2022Updated 3 years ago
- Official code for Cumulative Spatial Knowledge Distillation for Vision Transformers (ICCV-2023) https://openaccess.thecvf.com/content/ICC…☆15Nov 5, 2023Updated 2 years ago
- Scene classification baseline. Test Acc:90.14%☆16Jul 9, 2019Updated 6 years ago
- a codebase for multi label classification with PyTorch.☆15Nov 23, 2022Updated 3 years ago
- This repo holds the code for: {Transformer-based Spatio-temporal Analysis for Automatic Classification of Aortic Stenosis Severity from B…☆13Nov 29, 2022Updated 3 years ago
- Non-Autoregressive Math Word Problem Solver with Unified Tree Structure☆12Jan 13, 2024Updated 2 years ago
- Deep learning techniques for atherosclerotic plaque detection in the carotid artery☆16Jun 16, 2022Updated 3 years ago
- Towards Optimal Structured CNN Pruning via Generative Adversarial Learning☆18Mar 23, 2019Updated 7 years ago
- The code and data for the GPT-4 based benchmark in the vicuna blog post☆43Aug 2, 2023Updated 2 years ago
- Learning Multi-Attention Convolutional Neural Network for Fine-Grained Image Recognition☆12May 28, 2021Updated 4 years ago
- [ICLR 2021 Spotlight Oral] "Undistillable: Making A Nasty Teacher That CANNOT teach students", Haoyu Ma, Tianlong Chen, Ting-Kuei Hu, Che…☆83Dec 30, 2021Updated 4 years ago