lucidrains / LVMAE-pytorch
Implementation of the proposed LVMAE, from the paper "Extending Video Masked Autoencoders to 128 frames", in Pytorch
☆55 · Nov 25, 2024 · Updated last year
Alternatives and similar repositories for LVMAE-pytorch
Users who are interested in LVMAE-pytorch are comparing it to the libraries listed below.
- Implementation of the proposed MaskBit from Bytedance AI ☆83 · Nov 12, 2024 · Updated last year
- Implementation of the proposed Spline-Based Transformer from Disney Research ☆105 · Nov 9, 2024 · Updated last year
- ☆14 · Dec 11, 2024 · Updated last year
- An attempt to merge ESBN with Transformers, to endow Transformers with the ability to emergently bind symbols ☆16 · Aug 3, 2021 · Updated 4 years ago
- Attempt to make multiple residual streams from Bytedance's Hyper-Connections paper accessible to the public ☆172 · Feb 4, 2026 · Updated last week
- Implementation of Autoregressive Diffusion in Pytorch ☆432 · Dec 4, 2025 · Updated 2 months ago
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding" ☆27 · Jan 17, 2026 · Updated last month
- Just a repository that will house some MLPs and their variants, so to avoid having to reimplement them again and again for different proj… ☆45 · Jan 29, 2026 · Updated 2 weeks ago
- Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation" ☆182 · Jun 20, 2024 · Updated last year
- My attempts at applying Soundstream design on learned tokenization of text and then applying hierarchical attention to text generation ☆90 · Oct 11, 2024 · Updated last year
- Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena ☆207 · Aug 26, 2023 · Updated 2 years ago
- Limit Order Book for high-frequency trading (HFT) strategies using data science approaches ☆23 · Dec 12, 2021 · Updated 4 years ago
- [ICLR 2026] SparseD: Sparse Attention for Diffusion Language Models ☆57 · Oct 7, 2025 · Updated 4 months ago
- Implementation of Insertion-deletion Denoising Diffusion Probabilistic Models ☆30 · May 31, 2022 · Updated 3 years ago
- Explorations into the recently proposed Taylor Series Linear Attention ☆100 · Aug 18, 2024 · Updated last year
- An unofficial implementation of "UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding". ☆26 · Nov 4, 2023 · Updated 2 years ago
- quick playground to animate pippin ☆14 · Nov 11, 2024 · Updated last year
- Implementation of the Triangle Multiplicative module, used in Alphafold2 as an efficient way to mix rows or columns of a 2d feature map, … ☆39 · Aug 3, 2021 · Updated 4 years ago
- The original Shared Recurrent Memory Transformer implementation ☆33 · Jul 11, 2025 · Updated 7 months ago
- VLG-Net: Video-Language Graph Matching Networks for Video Grounding ☆31 · May 31, 2022 · Updated 3 years ago
- Official code of paper "PGT: A Progressive Method for Training Models on Long Videos" on CVPR2021 ☆31 · Mar 30, 2021 · Updated 4 years ago
- ☆33 · Jan 6, 2025 · Updated last year
- Implementation of the proposed Adam-atan2 from Google Deepmind in Pytorch ☆135 · Oct 15, 2025 · Updated 4 months ago
- The Best of Both Worlds: Integrating Language Models and Diffusion Models for Video Generation ☆39 · May 4, 2025 · Updated 9 months ago
- Weakly Supervised Video Moment Localisation with Contrastive Negative Sample Mining ☆30 · Apr 4, 2022 · Updated 3 years ago
- My explorations into editing the knowledge and memories of an attention network ☆35 · Dec 8, 2022 · Updated 3 years ago
- ☆71 · Nov 18, 2024 · Updated last year
- Pytorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI ☆1,326 · Jan 27, 2026 · Updated 3 weeks ago
- A red teaming agent ☆18 · Oct 15, 2025 · Updated 4 months ago
- DiffusionWithAutoscaler ☆29 · Apr 2, 2024 · Updated last year
- The official repo of continuous speculative decoding ☆31 · Mar 28, 2025 · Updated 10 months ago
- ☆10 · Dec 25, 2022 · Updated 3 years ago
- [WACV2025] Official PyTorch implementation of TrackDiffusion (https://arxiv.org/abs/2312.00651) ☆80 · Jun 26, 2024 · Updated last year
- Implementation of Recurrent Interface Network (RIN), for highly efficient generation of images and video without cascading networks, in P… ☆207 · Feb 14, 2024 · Updated 2 years ago
- A suite of image and video neural tokenizers ☆1,706 · Feb 11, 2025 · Updated last year
- The 2D discrete wavelet transform for JAX ☆45 · Feb 28, 2023 · Updated 2 years ago
- Structured Video Comprehension of Real-World Shorts ☆229 · Sep 21, 2025 · Updated 4 months ago
- Implementation of Soft MoE, proposed by Brain's Vision team, in Pytorch ☆344 · Apr 2, 2025 · Updated 10 months ago
- Ship remote sensing dataset ☆11 · Jun 28, 2022 · Updated 3 years ago