☆93Sep 10, 2025Updated 6 months ago
Alternatives and similar repositories for memory-r1
Users that are interested in memory-r1 are comparing it to the libraries listed below.
- The official implementation of the paper "Mem-α: Learning Memory Construction via Reinforcement Learning"☆190Dec 25, 2025Updated 3 months ago
- Official Implementation for [ICLR26] DefensiveKV: Taming the Fragility of KV Cache Eviction in LLM Inference☆31Mar 19, 2026Updated last week
- TOKEN-IMPORTANCE GUIDED DIRECT PREFERENCE OPTIMIZATION☆24Jan 26, 2026Updated 2 months ago
- Code repo for "CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs".☆16Sep 15, 2024Updated last year
- ☆34Nov 26, 2025Updated 4 months ago
- ☆12Apr 29, 2024Updated last year
- ☆10Jul 19, 2021Updated 4 years ago
- An official PyTorch implementation of "Certifiably Robust Graph Contrastive Learning" (NeurIPS 2023)☆11Jan 22, 2024Updated 2 years ago
- ☆46Mar 15, 2025Updated last year
- Official code repo for paper "Great Memory, Shallow Reasoning: Limits of kNN-LMs"☆23Apr 30, 2025Updated 10 months ago
- ☆18Dec 2, 2024Updated last year
- Emergent Hierarchical Reasoning in LLMs/VLMs through Reinforcement Learning☆62Oct 24, 2025Updated 5 months ago
- ☆39Oct 16, 2025Updated 5 months ago
- ☆18Jun 28, 2023Updated 2 years ago
- [NeurIPS 2023] Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective☆41Oct 17, 2023Updated 2 years ago
- ☆32Oct 4, 2025Updated 5 months ago
- User-friendly implementation of the Mixture-of-Sparse-Attention (MoSA). MoSA selects distinct tokens for each head with expert choice rou…☆28May 3, 2025Updated 10 months ago
- Official implementation of our paper "Exploring the Potential of Diffusion Large Language Models in Code Generation".☆20Oct 29, 2025Updated 5 months ago
- The Official Implementation of Ada-KV [NeurIPS 2025]☆128Nov 26, 2025Updated 4 months ago
- Incorporating the memory mechanism into the transformer and employing a parallel weighting structure to obtain a better utterance-level r…☆22Oct 4, 2025Updated 5 months ago
- The repository for the paper "Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs"☆14Dec 16, 2024Updated last year
- DOMAINEVAL is an auto-constructed benchmark for multi-domain code generation that consists of 2k+ subjects (i.e., description, reference …☆14Dec 12, 2024Updated last year
- [ICML 2025] LaCache: Ladder-Shaped KV Caching for Efficient Long-Context Modeling of Large Language Models☆18Nov 4, 2025Updated 4 months ago
- Code for the preprint "Cache Me If You Can: How Many KVs Do You Need for Effective Long-Context LMs?"☆48Jul 29, 2025Updated 8 months ago
- [CVPR 2026] Variation-aware Vision Token Dropping for Faster Large Vision-Language Models☆28Mar 18, 2026Updated last week
- Gaussian Embedding of Large-scale Attributed Graphs☆10Mar 13, 2020Updated 6 years ago
- ☆12Sep 12, 2024Updated last year
- ☆16Nov 26, 2024Updated last year
- In this recruiting competition, Winton challenges you to take on the very difficult task of predicting the future (stock returns).☆13Feb 10, 2020Updated 6 years ago
- ☆23Apr 26, 2021Updated 4 years ago
- ☆10Aug 13, 2021Updated 4 years ago
- [COLM 2025] Official code for "When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoni…☆15Oct 31, 2025Updated 4 months ago
- ☆144Updated this week
- 🫧 Code for Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data (Maekawa*, Iso* et al.…☆12Feb 25, 2025Updated last year
- Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination.☆22Jul 18, 2025Updated 8 months ago
- code for "Deep Learning for Sequential Recommendation: Algorithms, Influential Factors, and Evaluations"☆12Sep 7, 2020Updated 5 years ago
- Memory Mosaics are networks of associative memories working in concert to achieve a prediction task.☆61Jan 30, 2025Updated last year
- MARNNs Can Learn Generalized Dyck Languages☆12Nov 11, 2019Updated 6 years ago
- CJOBS☆15Dec 22, 2018Updated 7 years ago