DaShenZi721 / HRA
☆29Updated 3 weeks ago
Alternatives and similar repositories for HRA
Users who are interested in HRA are comparing it to the libraries listed below
- source code for paper "Riemannian Preconditioned LoRA for Fine-Tuning Foundation Models"☆26Updated last year
- (ICML 2023) Discover and Cure: Concept-aware Mitigation of Spurious Correlation☆41Updated last year
- Official implementation for our paper "Scaling Diffusion Transformers Efficiently via μP".☆69Updated last month
- SLTrain: a sparse plus low-rank approach for parameter and memory efficient pretraining (NeurIPS 2024)☆31Updated 7 months ago
- Official Code Repository for the paper "Continuous Diffusion Model for Language Modeling".☆33Updated 3 months ago
- A list of papers for group meeting☆16Updated last month
- Official implementation of GOAT model (ICML2023)☆37Updated last year
- Official Implementation of Paper "Learning to Jump: Thinning and Thickening Latent Counts for Generative Modeling" (ICML 2023)☆10Updated 2 years ago
- Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation, ICML 2024☆22Updated last year
- ☆16Updated 3 months ago
- Welcome to the 'In Context Learning Theory' Reading Group☆28Updated 7 months ago
- Official PyTorch implementation for "Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data" (ICLR…☆49Updated last month
- A list of diffusion papers in the NLP domain that I have read, mainly on text generation; the table will continue to be updated.☆46Updated 3 months ago
- Official PyTorch implementation of DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs (ICML 2025 Oral)☆23Updated 2 weeks ago
- ☆15Updated 10 months ago
- Official PyTorch implementation for "Unifying Bayesian Flow Networks and Diffusion Models through Stochastic Differential Equations"☆40Updated last year
- Code for "Towards Revealing the Mystery behind Chain of Thought: a Theoretical Perspective"☆20Updated last year
- Benchmark for Natural Temporal Distribution Shift (NeurIPS 2022)☆66Updated 2 years ago
- Codes for Merging Large Language Models☆32Updated 10 months ago
- This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"☆38Updated 8 months ago
- [NeurIPS 2023 Spotlight] Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training☆35Updated 2 months ago
- PyTorch implementation of ICML-2024 "Navigating Complexity: Toward Lossless Graph Condensation via Expanding Window Matching"☆24Updated last year
- PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024)☆37Updated 7 months ago
- Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation (ICML'24 Oral)☆13Updated 11 months ago
- Code for paper: Aligning Large Language Models with Representation Editing: A Control Perspective☆32Updated 5 months ago
- Official repository for paper "DeepCritic: Deliberate Critique with Large Language Models"☆30Updated last month
- EMPO, A Fully Unsupervised RLVR Method☆43Updated this week
- Deep Learning & Information Bottleneck☆60Updated 2 years ago
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)☆90Updated 8 months ago
- AnchorAttention: Improved attention for LLMs long-context training☆208Updated 5 months ago