maszhongming / ParaKnowTransfer
Code for "Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective"
☆33May 9, 2024Updated last year
Alternatives and similar repositories for ParaKnowTransfer
Users who are interested in ParaKnowTransfer are comparing it to the libraries listed below.
- The Shifted and The Overlooked: A Task-oriented Investigation of User-GPT Interactions (EMNLP 2023)☆13Dec 21, 2023Updated 2 years ago
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"☆17Feb 22, 2024Updated last year
- Official PyTorch implementation of CD-MOE☆12Mar 29, 2025Updated 10 months ago
- ☆12Oct 26, 2022Updated 3 years ago
- [ACL'22] Training-free Neural Architecture Search for RNNs and Transformers☆14May 26, 2024Updated last year
- Official PyTorch implementation of "Evolving Search Space for Neural Architecture Search"☆12Aug 18, 2021Updated 4 years ago
- Codes for DATA: Differentiable ArchiTecture Approximation.☆11Jul 22, 2021Updated 4 years ago
- ICML2019 Accepted Paper. Overcoming Multi-Model Forgetting☆14Jun 5, 2019Updated 6 years ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models☆40Jun 10, 2024Updated last year
- Benchmarks for Macro Neural Architecture Search; used and described in the paper "Local Search is a Remarkably Strong Baseline for Neural…☆12Jul 25, 2024Updated last year
- Consistent dialogue generation☆16Oct 26, 2022Updated 3 years ago
- ☆20Aug 16, 2021Updated 4 years ago
- ☆19May 11, 2021Updated 4 years ago
- PyTorch implementation for OD-cheap-convolution.☆20Sep 29, 2019Updated 6 years ago
- Source code for EMNLP findings paper "Open-Vocabulary Argument Role Prediction for Event Extraction"☆19Nov 5, 2022Updated 3 years ago
- GIFT (ACL 2023) & MPC-BERT (ACL 2021) for Multi-Party Conversation Understanding☆41Jul 12, 2023Updated 2 years ago
- ☆18Nov 6, 2019Updated 6 years ago
- Reducing Channel Redundancy in Convolutional Neural Networks by Features Recombining (TIP 2021)☆20Mar 1, 2023Updated 2 years ago
- ArcherCodeR is an open-source initiative enhancing code reasoning in large language models through scalable, rule-governed reinforcement …☆44Aug 6, 2025Updated 6 months ago
- ☆19Mar 5, 2019Updated 6 years ago
- FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation☆51Aug 24, 2025Updated 5 months ago
- Revisiting Parameter Sharing for Automatic Neural Channel Number Search, NeurIPS 2020☆22Nov 15, 2020Updated 5 years ago
- ☆26Apr 12, 2022Updated 3 years ago
- Code for "ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models" (ICLR 2024)☆20Feb 16, 2024Updated last year
- Single Path One-Shot NAS MXNet implementation with Supernet training and searching☆19Dec 23, 2019Updated 6 years ago
- Code of our NeurIPS 2020 paper "Auto Learning Attention", coming soon☆22Apr 14, 2021Updated 4 years ago
- ☆18Oct 16, 2019Updated 6 years ago
- D^2-MoE: Delta Decompression for MoE-based LLMs Compression☆72Mar 25, 2025Updated 10 months ago
- Repository for "Accelerating Neural Architecture Search using Performance Prediction" (ICLR Workshop 2018)☆18Mar 21, 2018Updated 7 years ago
- Codebase for Context-aware Meta-learned Loss Scaling (CaMeLS). https://arxiv.org/abs/2305.15076.☆25Jan 23, 2024Updated 2 years ago
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal…☆56Feb 28, 2023Updated 2 years ago
- [NeurIPS 2020] "Does Unsupervised Architecture Representation Learning Help Neural Architecture Search?" by Shen Yan, Yu Zheng, Wei Ao, X…☆50Jan 19, 2021Updated 5 years ago
- ☆26Dec 10, 2020Updated 5 years ago
- TF-FD☆20Nov 19, 2022Updated 3 years ago
- [NeurIPS 2021] "Stronger NAS with Weaker Predictors", Junru Wu, Xiyang Dai, Dongdong Chen, Yinpeng Chen, Mengchen Liu, Ye Yu, Zhangyang W…☆27Sep 23, 2022Updated 3 years ago
- ☆28Dec 2, 2024Updated last year
- A highly modular PyTorch framework with a focus on Neural Architecture Search (NAS).☆23Dec 3, 2021Updated 4 years ago
- Official PyTorch implementation of "Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets" (ICLR 2023 notable top 25%)☆26Mar 18, 2024Updated last year
- ☆63Oct 17, 2023Updated 2 years ago