ai-in-pm / Titans---Learning-to-Memorize-at-Test-Time
Titans - Learning to Memorize at Test Time
☆61Updated last year
Alternatives and similar repositories for Titans---Learning-to-Memorize-at-Test-Time
Users who are interested in Titans---Learning-to-Memorize-at-Test-Time compare it to the repositories listed below.
- Inference Speed Benchmark for Learning to (Learn at Test Time): RNNs with Expressive Hidden States☆81Updated last year
- ☆51Updated 8 months ago
- Official PyTorch Implementation for Vision-Language Models Create Cross-Modal Task Representations, ICML 2025☆31Updated 9 months ago
- PyTorch implementation of Titans.☆31Updated last year
- [ICLR 2026] Geometric-Mean Policy Optimization☆98Updated last week
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models☆35Updated last year
- MobileLLM-R1☆75Updated 4 months ago
- [ICML 2025] Code for "R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts"☆17Updated 10 months ago
- This repo contains the source code for VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks (NeurIPS 2024).☆42Updated last year
- [ICLR 2025] Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization☆23Updated 3 months ago
- Model Merging with Functional Dual Anchors☆45Updated 2 months ago
- [ICML 2024 Oral] This project is the official implementation of our Accurate LoRA-Finetuning Quantization of LLMs via Information Retenti…☆67Updated last year
- [NeurIPS 2024] Official Repository of The Mamba in the Llama: Distilling and Accelerating Hybrid Models☆236Updated 3 months ago
- [NeurIPS 2025] Official implementation of "Reasoning Path Compression: Compressing Generation Trajectories for Efficient LLM Reasoning"☆29Updated 3 months ago
- Defeating the Training-Inference Mismatch via FP16☆180Updated 2 months ago
- ☆73Updated 7 months ago
- M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models☆47Updated 6 months ago
- [NeurIPS'25 Spotlight🔥] Official Implementation of RobustMerge: Parameter-Efficient Model Merging for MLLMs with Direction Robustness☆56Updated last month
- ☆70Updated last year
- ☆169Updated 4 months ago
- [NeurIPS'25] dKV-Cache: The Cache for Diffusion Language Models☆129Updated 8 months ago
- imagetokenizer is a Python package that helps you encode visuals and generate visual token IDs from a codebook; it supports both image and video…☆40Updated last year
- Resa: Transparent Reasoning Models via SAEs☆47Updated 4 months ago
- The official github repo for "Diffusion Language Models are Super Data Learners".☆219Updated 2 months ago
- Official repo of paper LM2☆46Updated 11 months ago
- This project implements the Titans architecture from the paper "Titans: Learning to Memorize at Test Time" for market data prediction.☆11Updated last year
- Code for "RSQ: Learning from Important Tokens Leads to Better Quantized LLMs"☆20Updated 7 months ago
- [ICLR 2025 Spotlight] Official Implementation for ToST (Token Statistics Transformer)☆131Updated 11 months ago
- [ICLR 2026] RPG: KL-Regularized Policy Gradient (https://arxiv.org/abs/2505.17508)☆64Updated last week
- [ICLR2025] Codebase for "ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing", built on Megatron-LM.☆105Updated last year