neilwen987 / CSR_Adaptive_Rep
Official Code for Paper: Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation
☆133 Updated last month
Alternatives and similar repositories for CSR_Adaptive_Rep
Users who are interested in CSR_Adaptive_Rep are comparing it to the libraries listed below.
- ☆236 Updated last week
- Diffusion Language Models For Code Infilling Beyond Fixed-size Canvas ☆99 Updated last week
- [ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers ☆75 Updated 7 months ago
- [ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis. ☆179 Updated 4 months ago
- [ICLR 2026] Geometric-Mean Policy Optimization ☆99 Updated 2 weeks ago
- ☆91 Updated last year
- JudgeLRM: Large Reasoning Models as a Judge ☆41 Updated last week
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models ☆57 Updated 8 months ago
- Model Merging with SVD to Tie the KnOTS [ICLR 2025] ☆85 Updated 10 months ago
- The official github repo for "Diffusion Language Models are Super Data Learners". ☆221 Updated 3 months ago
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging" ☆32 Updated last year
- LLaDA2.0 is the diffusion language model series developed by InclusionAI team, Ant Group. ☆240 Updated last month
- Official PyTorch implementation and models for paper "Diffusion Beats Autoregressive in Data-Constrained Settings". We find diffusion mod… ☆120 Updated last month
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight) ☆154 Updated 7 months ago
- SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Model https://arxiv.org/pdf/2411.02433 ☆116 Updated last year
- [NeurIPS 2025] Thinkless: LLM Learns When to Think ☆251 Updated 4 months ago
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains ☆50 Updated this week
- AnchorAttention: Improved attention for LLMs long-context training ☆213 Updated last year
- [NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs ☆94 Updated last year
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models." ☆52 Updated last year
- Easy and Efficient dLLM Fine-Tuning ☆209 Updated 3 weeks ago
- Defeating the Training-Inference Mismatch via FP16 ☆181 Updated 2 months ago
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025) ☆36 Updated 11 months ago
- ☆50 Updated last year
- [NeurIPS 2024 Main Track] Code for the paper titled "Instruction Tuning With Loss Over Instructions" ☆38 Updated last year
- PeRL: Parameter-Efficient Reinforcement Learning ☆68 Updated 3 weeks ago
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch ☆182 Updated 7 months ago
- Official implementation of MAIA, A Multimodal Automated Interpretability Agent ☆102 Updated 3 months ago
- Code for Heima ☆59 Updated 9 months ago
- ☆21 Updated 4 months ago