ByteDance-Seed / AHN
AHN: Artificial Hippocampus Networks for Efficient Long-Context Modeling
☆161Updated 2 months ago
Alternatives and similar repositories for AHN
Users who are interested in AHN are comparing it to the libraries listed below
- ☆68Updated 2 months ago
- ☆84Updated 9 months ago
- Official Repository of Native Parallel Reasoner☆92Updated 3 weeks ago
- ☆184Updated 11 months ago
- The official repo of One RL to See Them All: Visual Triple Unified Reinforcement Learning☆329Updated 7 months ago
- Official code repository for Sketch-of-Thought (SoT)☆130Updated 8 months ago
- ☆78Updated 8 months ago
- [NeurIPS 2025] The official repo of SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond☆187Updated 6 months ago
- ☆126Updated this week
- Ring-V2 is a reasoning MoE LLM provided and open-sourced by InclusionAI.☆87Updated 2 months ago
- The code and data of We-Math, accepted by ACL 2025 main conference.☆134Updated last month
- The official repo for "OpenMoE 2: Sparse Diffusion Language Models".☆51Updated 2 weeks ago
- LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture☆212Updated last year
- [NeurIPS 2025] Thinkless: LLM Learns When to Think☆249Updated 3 months ago
- Easy and Efficient dLLM Fine-Tuning☆190Updated 3 weeks ago
- [NeurIPS 24 Spotlight] MaskLLM: Learnable Semi-structured Sparsity for Large Language Models☆183Updated last year
- ☆195Updated 2 weeks ago
- Landing repository for the paper "Predicting the Order of Upcoming Tokens Improves Language Modeling"☆41Updated 4 months ago
- ☆32Updated 5 months ago
- Sequential Diffusion Language Model (SDLM) enhances pre-trained autoregressive language models by adaptively determining generation lengt…☆85Updated 2 weeks ago
- The official repository of "R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Integration"☆134Updated 4 months ago
- official code for "BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning"☆36Updated 11 months ago
- [ICML 2025] TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation☆118Updated 7 months ago
- LongRoPE is a novel method that can extend the context window of pre-trained LLMs to an impressive 2048k tokens.☆276Updated 2 months ago
- An official implementation of "CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning"☆169Updated 2 weeks ago
- SCOPE: Self-evolving Context Optimization via Prompt Evolution - A framework for automatic prompt optimization☆52Updated 3 weeks ago
- MiroTrain is an efficient and algorithm-first framework for post-training large agentic models.☆114Updated 4 months ago
- [NeurIPS'25 Oral] Query-agnostic KV cache eviction: 3–4× reduction in memory and 2× decrease in latency (Qwen3/2.5, Gemma3, LLaMA3)☆178Updated last week
- GPU-optimized framework for training diffusion language models at any scale. The backend of Quokka, Super Data Learners, and OpenMoE 2 tr…☆309Updated 2 months ago
- (NIPS 2025) OpenOmni: Official implementation of Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Align…☆119Updated 2 months ago