Code for "Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective"
☆33May 9, 2024Updated last year
Alternatives and similar repositories for ParaKnowTransfer
Users that are interested in ParaKnowTransfer are comparing it to the libraries listed below
- The Shifted and The Overlooked: A Task-oriented Investigation of User-GPT Interactions (EMNLP 2023)☆13Dec 21, 2023Updated 2 years ago
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"☆17Feb 22, 2024Updated 2 years ago
- Radiantloom Email Assist 7B is an email-assistant large language model fine-tuned from Zephyr-7B-Beta, over a custom-curated dataset of 1…☆14Jan 19, 2024Updated 2 years ago
- Reference implementation for Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model☆45Oct 1, 2025Updated 5 months ago
- Official PyTorch implementation of CD-MOE☆12Mar 29, 2025Updated 11 months ago
- Official PyTorch implementation of "Evolving Search Space for Neural Architecture Search"☆12Aug 18, 2021Updated 4 years ago
- [ICML2024] DetKDS: Knowledge Distillation Search for Object Detectors☆19Jul 11, 2024Updated last year
- Codes for DATA: Differentiable ArchiTecture Approximation.☆11Jul 22, 2021Updated 4 years ago
- ☆10Jul 25, 2024Updated last year
- ICML2019 Accepted Paper. Overcoming Multi-Model Forgetting☆14Jun 5, 2019Updated 6 years ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models☆40Jun 10, 2024Updated last year
- Consistent dialogue generation☆16Oct 26, 2022Updated 3 years ago
- Benchmarks for Macro Neural Architecture Search; used and described in the paper "Local Search is a Remarkably Strong Baseline for Neural…☆12Jul 25, 2024Updated last year
- Data-free knowledge distillation using Gaussian noise (NeurIPS paper)☆15Mar 24, 2023Updated 2 years ago
- Auto-Prox-AAAI24☆14Apr 30, 2024Updated last year
- ☆20Aug 16, 2021Updated 4 years ago
- ☆19May 11, 2021Updated 4 years ago
- ☆17Jul 10, 2022Updated 3 years ago
- PyTorch implementation for OD-cheap-convolution.☆20Sep 29, 2019Updated 6 years ago
- GIFT (ACL 2023) & MPC-BERT (ACL 2021) for Multi-Party Conversation Understanding☆41Jul 12, 2023Updated 2 years ago
- Reducing Channel Redundancy in Convolutional Neural Networks by Features Recombining (TIP 2021)☆20Mar 1, 2023Updated 3 years ago
- ☆18Nov 6, 2019Updated 6 years ago
- [NeurIPS 2019] E2-Train: Training State-of-the-art CNNs with Over 80% Less Energy☆21Nov 18, 2019Updated 6 years ago
- Repository for the EMNLP 2023 Demo Paper "Reaction Miner: An Integrated System for Chemical Reaction Extraction from Textual Data"☆19Jan 27, 2025Updated last year
- A learning rate recommending and benchmarking tool.☆20May 19, 2023Updated 2 years ago
- ArcherCodeR is an open-source initiative enhancing code reasoning in large language models through scalable, rule-governed reinforcement …☆45Aug 6, 2025Updated 7 months ago
- [NeurIPS 2024] VeLoRA : Memory Efficient Training using Rank-1 Sub-Token Projections☆21Oct 15, 2024Updated last year
- Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation, ICML 2024☆22Jun 26, 2024Updated last year
- ☆19Mar 5, 2019Updated 7 years ago
- Code and dataset for the EMNLP paper "Instruct and Extract: Instruction Tuning for On-Demand Information Extraction"☆54Jan 2, 2024Updated 2 years ago
- ☆26Apr 12, 2022Updated 3 years ago
- Code for paper "Incorporating Multimodal Information in Open-Domain Web Keyphrase Extraction"☆19Jan 28, 2021Updated 5 years ago
- Momentum Decoding: Open-ended Text Generation as Graph Exploration☆19Jan 27, 2023Updated 3 years ago
- Discovering directional relations via minimum predictive information regularization☆23Jan 13, 2020Updated 6 years ago
- Revisiting Parameter Sharing for Automatic Neural Channel Number Search, NeurIPS 2020☆22Nov 15, 2020Updated 5 years ago
- FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation☆51Aug 24, 2025Updated 6 months ago
- Code for "ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models" (ICLR 2024)☆20Feb 16, 2024Updated 2 years ago
- Single Path One-Shot NAS MXNet implementation with Supernet training and searching☆19Dec 23, 2019Updated 6 years ago
- A simple PyTorch implementation of Differentiable Architecture Search (DARTS)☆22Aug 27, 2019Updated 6 years ago