The open-source materials for the paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity".
☆32Nov 12, 2024Updated last year
Alternatives and similar repositories for SparsingLaw
Users that are interested in SparsingLaw are comparing it to the libraries listed below.
- ☆21Jun 4, 2025Updated last year
- ☆12Jun 13, 2025Updated last year
- Official Implementation of wd1☆30Sep 25, 2025Updated 8 months ago
- Source code for the ACL'2025 paper titled "Unveiling Privacy Risks in LLM Agent Memory"☆32Dec 2, 2025Updated 6 months ago
- Fork of Flame repo for training of some new stuff in development☆19Jun 1, 2026Updated last week
- DICE: Detecting In-distribution Data Contamination with LLM's Internal State☆12Sep 21, 2024Updated last year
- ☆12Sep 8, 2023Updated 2 years ago
- ☆11Jun 11, 2025Updated last year
- [NDSS 2026] Official repo for Odysseus: Jailbreaking Commercial Multimodal LLM-integrated Systems via Dual Steganography☆56Mar 14, 2026Updated 3 months ago
- TMMA: A Tiled Matrix Multiplication Accelerator for Self-Attention Projections in Transformer Models, optimized for edge deployment on Xi…☆34Apr 7, 2026Updated 2 months ago
- [ACL 2023 Findings] Emergent Modularity in Pre-trained Transformers☆26Jun 7, 2023Updated 3 years ago
- [NeurIPS 2025] Reasoning Models Better Express Their Confidence☆23Nov 19, 2025Updated 6 months ago
- This repository provides the official implementation of QSVD, a method for efficient low-rank approximation that unifies Query-Key-Value …☆27May 16, 2026Updated 3 weeks ago
- ☆19Apr 16, 2025Updated last year
- To Think or Not to Think: Exploring the Unthinking Vulnerability in Large Reasoning Models☆33May 21, 2025Updated last year
- ☆23Oct 22, 2025Updated 7 months ago
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models"☆10Jul 19, 2024Updated last year
- ☆13Sep 7, 2024Updated last year
- Resa: Transparent Reasoning Models via SAEs☆49Sep 23, 2025Updated 8 months ago
- Set-Encoder: Permutation-Invariant Inter-Passage Attention for Listwise Passage Re-Ranking with Cross-Encoders☆19May 23, 2025Updated last year
- The official implementation of ICLR 2025 paper "Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models".☆18Apr 25, 2025Updated last year
- Learning to Skip the Middle Layers of Transformers☆17Aug 7, 2025Updated 10 months ago
- Implementation of the paper "ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability"☆67Jun 3, 2025Updated last year
- Code for the paper "Firewalls to Secure Dynamic LLM Agentic Networks"☆30Jun 6, 2025Updated last year
- ☆15Mar 20, 2025Updated last year
- [ACL 2025 Findings] Implicit Reasoning in Transformers is Reasoning through Shortcuts☆18Mar 11, 2025Updated last year
- PyTorch implementation of StableMask (ICML'24)☆15Jun 27, 2024Updated last year
- [ICLR 2025] Official Pytorch Implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia…☆30Jul 24, 2025Updated 10 months ago
- Code for the paper "Cottention: Linear Transformers With Cosine Attention"☆20Nov 15, 2025Updated 6 months ago
- Sparse Backpropagation for Mixture-of-Expert Training☆30Jul 2, 2024Updated last year
- Unofficial implementation of the Ask-LLM paper 'How to Train Data-Efficient LLMs', arXiv:2402.09668.☆12Jun 19, 2024Updated last year
- [ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models☆18Nov 4, 2025Updated 7 months ago
- Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers☆29Mar 1, 2025Updated last year
- ☆33Nov 11, 2024Updated last year
- ☆11Jan 21, 2021Updated 5 years ago
- ☆21Apr 3, 2026Updated 2 months ago
- ☆16Jul 23, 2024Updated last year
- ☆32Jun 5, 2025Updated last year
- [NeurIPS 2024] VeLoRA : Memory Efficient Training using Rank-1 Sub-Token Projections☆22Oct 15, 2024Updated last year