torphix / infini-attention
PyTorch implementation of https://arxiv.org/html/2404.07143v1
☆21Updated last year
Alternatives and similar repositories for infini-attention
Users who are interested in infini-attention are comparing it to the libraries listed below
- Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs☆90Updated 11 months ago
- [ICML 2025] |TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation☆114Updated 4 months ago
- ☆74Updated last year
- [EMNLP 2024] RWKV-CLIP: A Robust Vision-Language Representation Learner☆142Updated 4 months ago
- A Framework for Decoupling and Assessing the Capabilities of VLMs☆43Updated last year
- mllm-npu: training multimodal large language models on Ascend NPUs☆92Updated last year
- Geometric-Mean Policy Optimization☆83Updated 2 weeks ago
- ☆84Updated 6 months ago
- Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines☆126Updated 11 months ago
- A collection of tricks and tools to speed up transformer models☆182Updated last week
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models☆137Updated last year
- LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture☆211Updated 9 months ago
- ☆35Updated 8 months ago
- DELT: Data Efficacy for Language Model Training☆39Updated last month
- Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI.☆121Updated last week
- MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning.☆236Updated 2 months ago
- [NeurIPS 2025] Elevating Visual Perception in Multimodal LLMs with Auxiliary Embedding Distillation, arXiv 2024☆63Updated 3 weeks ago
- FuseAI Project☆87Updated 8 months ago
- Open-Pandora: On-the-fly Control Video Generation☆34Updated 10 months ago
- ☆185Updated 8 months ago
- ☆95Updated 10 months ago
- [ICML'24] The official implementation of “Rethinking Optimization and Architecture for Tiny Language Models”☆123Updated 9 months ago
- imagetokenizer is a Python package that helps you encode visuals and generate visual token IDs from a codebook; it supports both image and video…☆37Updated last year
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*☆109Updated 4 months ago
- Code for paper "Patch-Level Training for Large Language Models"☆88Updated 11 months ago
- CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models (NeurIPS 2025)☆152Updated 3 weeks ago
- MiroTrain is an efficient and algorithm-first framework for post-training large agentic models.☆88Updated last month
- ☆101Updated 4 months ago
- Efficient Infinite Context Transformers with Infini-attention: PyTorch Implementation + QwenMoE Implementation + Training Script + 1M cont…☆84Updated last year
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks (EMNLP'24)☆147Updated last year