kiddyboots216 / lottery-ticket-adaptation
Lottery Ticket Adaptation
☆39 Updated 5 months ago
Alternatives and similar repositories for lottery-ticket-adaptation:
Users who are interested in lottery-ticket-adaptation are comparing it to the libraries listed below.
- ☆31 Updated 3 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆29 Updated last month
- Efficient Scaling laws and collaborative pretraining. ☆16 Updated 2 months ago
- ☆15 Updated 2 weeks ago
- Exploration of automated dataset selection approaches at large scales. ☆39 Updated last month
- Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts" ☆15 Updated last month
- This repo is based on https://github.com/jiaweizzhao/GaLore ☆26 Updated 7 months ago
- Code for reproducing our paper "Low Rank Adapting Models for Sparse Autoencoder Features" ☆10 Updated 3 weeks ago
- ☆24 Updated 7 months ago
- ☆13 Updated 4 months ago
- From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu,… ☆44 Updated this week
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆55 Updated 7 months ago
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812) ☆30 Updated last month
- Official implementation of ECCV24 paper: POA ☆24 Updated 8 months ago
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format ☆27 Updated last year
- Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More ☆28 Updated 2 months ago
- Official Repository of Are Your LLMs Capable of Stable Reasoning? ☆25 Updated last month
- [ICLR 2025] Official PyTorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia… ☆19 Updated 4 months ago
- A testbed for agents and environments that can automatically improve models through data generation. ☆23 Updated last month
- NeurIPS 2024 tutorial on LLM Inference ☆42 Updated 4 months ago
- A repository for research on medium sized language models. ☆76 Updated 11 months ago
- ☆17 Updated 3 months ago
- Using FlexAttention to compute attention with different masking patterns ☆43 Updated 7 months ago
- Minimal implementation of the Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models paper (arXiv:2401.01335) ☆29 Updated last year
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data" ☆17 Updated last year
- Knowledge Unlearning for Large Language Models ☆25 Updated 3 weeks ago
- ☆44 Updated 11 months ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" ☆37 Updated last year
- ☆22 Updated last week
- Code for "Merging Text Transformers from Different Initializations" ☆20 Updated 2 months ago