ablghtianyi / ICL_Modular_Arithmetic
☆17 Updated 3 months ago
Alternatives and similar repositories for ICL_Modular_Arithmetic
Users who are interested in ICL_Modular_Arithmetic are comparing it to the repositories listed below:
- ☆17 Updated 7 months ago
- ☆12 Updated 11 months ago
- ☆26 Updated last month
- ☆17 Updated 2 weeks ago
- Official Repo of Paper "Eliminating Position Bias of Language Models: A Mechanistic Approach" ☆11 Updated 5 months ago
- Implementation of PaCE: Parsimonious Concept Engineering for Large Language Models (NeurIPS 2024) ☆32 Updated 3 months ago
- Lightweight Adapting for Black-Box Large Language Models ☆19 Updated last year
- Unofficial Implementation of Selective Attention Transformer ☆15 Updated 3 months ago
- [ICLR 2025] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates (Oral) ☆69 Updated 3 months ago
- [NeurIPS2024] Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging ☆48 Updated 2 months ago
- ☆28 Updated 3 months ago
- Efficient Scaling laws and collaborative pretraining. ☆14 Updated 3 weeks ago
- [ICML 2024] Junk DNA Hypothesis: A Task-Centric Angle of LLM Pre-trained Weights through Sparsity; Lu Yin*, Ajay Jaiswal*, Shiwei Liu, So… ☆16 Updated 8 months ago
- Stick-breaking attention ☆43 Updated last month
- ☆27 Updated 3 months ago
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025) ☆25 Updated last week
- Self-Supervised Alignment with Mutual Information ☆16 Updated 8 months ago
- Official code for the paper "Attention as a Hypernetwork" ☆23 Updated 7 months ago
- PyTorch codes for the paper "An Empirical Study of Multimodal Model Merging" ☆38 Updated last year
- Codebase for Instruction Following without Instruction Tuning ☆33 Updated 4 months ago
- Applies ROME and MEMIT on Mamba-S4 models ☆14 Updated 10 months ago
- Official PyTorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxiang Li, Lu Yi… ☆16 Updated last month
- Official Implementation of the Paper "DeciMamba: Exploring the Length Extrapolation Potential of Mamba" ☆23 Updated 6 months ago
- Universal Neurons in GPT2 Language Models ☆27 Updated 8 months ago
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆70 Updated 3 months ago
- Official Implementation of UA^{2}-Agent and other baseline algorithms of "Towards Unified Alignment Between Agents, Humans, and Environme… ☆16 Updated 3 months ago
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval" ☆25 Updated 10 months ago
- ☆30 Updated 11 months ago