neilwen987 / CSR_Adaptive_Rep
Official Code for Paper: Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation
☆132Updated 6 months ago
Alternatives and similar repositories for CSR_Adaptive_Rep
Users who are interested in CSR_Adaptive_Rep are comparing it to the repositories listed below.
- LLaDA2.0 is the diffusion language model series developed by InclusionAI team, Ant Group.☆218Updated 3 weeks ago
- [ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis.☆172Updated 3 months ago
- [NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models☆56Updated 7 months ago
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains☆50Updated 8 months ago
- Official PyTorch implementation and models for paper "Diffusion Beats Autoregressive in Data-Constrained Settings". We find diffusion mod…☆118Updated 2 months ago
- [ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers☆74Updated 6 months ago
- Geometric-Mean Policy Optimization☆96Updated last month
- JudgeLRM: Large Reasoning Models as a Judge☆40Updated last month
- Diffusion Language Models For Code Infilling Beyond Fixed-size Canvas☆97Updated 3 months ago
- Code for paper "Unraveling Cross-Modality Knowledge Conflicts in Large Vision-Language Models."☆50Updated last year
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025☆31Updated 8 months ago
- Esoteric Language Models☆108Updated last month
- [ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)☆151Updated 6 months ago
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging"☆31Updated last year
- AnchorAttention: Improved attention for LLMs long-context training☆213Updated 11 months ago
- ☆138Updated 10 months ago
- Easy and Efficient dLLM Fine-Tuning☆190Updated 3 weeks ago
- [ACL 2025] A Generalizable and Purely Unsupervised Self-Training Framework☆71Updated 7 months ago
- ☆91Updated last year
- Code for "Language Models Can Learn from Verbal Feedback Without Scalar Rewards"☆55Updated last week
- [ICLR 2025 Oral] "Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free"☆86Updated last year
- [COLING'25] Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?☆82Updated 11 months ago
- ☆371Updated 2 months ago
- [Technical Report] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with …☆63Updated last year
- Decomposing and Editing Predictions by Modeling Model Computation☆139Updated last year
- One-shot Entropy Minimization☆187Updated 6 months ago
- [ICML 2025] Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction☆82Updated 7 months ago
- An official implementation of "Catastrophic Failure of LLM Unlearning via Quantization" (ICLR 2025)☆35Updated 10 months ago
- The official github repo for "Diffusion Language Models are Super Data Learners".☆215Updated 2 months ago
- SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Model https://arxiv.org/pdf/2411.02433☆116Updated last year