Olivia-fsm / DoGE
Codebase for ICML submission "DOGE: Domain Reweighting with Generalization Estimation"
☆17 · Updated last year
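For context on the technique named in the repository description, the sketch below illustrates the generic idea of domain reweighting guided by a generalization signal: a multiplicative (exponentiated-gradient) update of domain sampling weights based on how well each domain's training gradient aligns with a held-out gradient. This is a minimal, assumed illustration only; the function name, hyperparameters, and exact update rule are illustrative and are not taken from the DoGE codebase.

```python
import numpy as np

def update_domain_weights(domain_grads, generalization_grad, weights, lr=0.1):
    """One exponentiated-gradient step on the domain simplex (illustrative sketch).

    Each domain's score is the inner product between its training gradient and a
    held-out "generalization" gradient; domains whose gradients transfer better
    receive larger sampling weights.
    """
    scores = np.array([g @ generalization_grad for g in domain_grads])
    new_weights = weights * np.exp(lr * scores)   # multiplicative update
    return new_weights / new_weights.sum()        # renormalize onto the simplex

# Toy example: three domains, a 4-dimensional parameter space.
rng = np.random.default_rng(0)
domain_grads = [rng.normal(size=4) for _ in range(3)]
generalization_grad = rng.normal(size=4)
weights = np.ones(3) / 3
print(update_domain_weights(domain_grads, generalization_grad, weights))
```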
Alternatives and similar repositories for DoGE:
Users who are interested in DoGE are comparing it to the libraries listed below.
- ☆66 · Updated 3 years ago
- Learning adapter weights from task descriptions ☆17 · Updated last year
- Skill-It! A Data-Driven Skills Framework for Understanding and Training Language Models ☆46 · Updated last year
- ☆49 · Updated last year
- ☆93 · Updated last year
- Augmenting Statistical Models with Natural Language Parameters ☆26 · Updated 7 months ago
- EMNLP 2024: Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue ☆35 · Updated 5 months ago
- Repo for ACL2023 Findings paper "Emergent Modularity in Pre-trained Transformers" ☆23 · Updated last year
- Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025] ☆29 · Updated 3 months ago
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024) ☆109 · Updated last year
- Exploration of automated dataset selection approaches at large scales. ☆39 · Updated 2 months ago
- [NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors ☆75 · Updated 4 months ago
- Implementation of Gradient Information Optimization (GIO) for effective and scalable training data selection ☆13 · Updated last year
- A curated list of awesome resources dedicated to Scaling Laws for LLMs ☆71 · Updated 2 years ago
- Simple and scalable tools for data-driven pretraining data selection. ☆23 · Updated 2 months ago
- ☆47 · Updated last year
- This is the repository for "Model Merging by Uncertainty-Based Gradient Matching", ICLR 2024. ☆27 · Updated 11 months ago
- AI Logging for Interpretability and Explainability 🔬 ☆115 · Updated 10 months ago
- Test-time-training on nearest neighbors for large language models ☆40 · Updated last year
- LoFiT: Localized Fine-tuning on LLM Representations ☆37 · Updated 3 months ago
- ☆4 · Updated 3 months ago
- Long Context Extension and Generalization in LLMs ☆53 · Updated 7 months ago
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So… ☆16 · Updated 2 weeks ago
- ☆35 · Updated last year
- ☆62 · Updated last year
- TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models ☆67 · Updated last year
- ☆40 · Updated last year
- PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024) ☆35 · Updated 6 months ago
- Code for the paper "Spectral Editing of Activations for Large Language Model Alignments" ☆22 · Updated 4 months ago
- [ICLR'25 Spotlight] Min-K%++: Improved baseline for detecting pre-training data of LLMs ☆38 · Updated 2 months ago