kyegomez / MC-ViT
Implementation of the MC-ViT model from the paper "Memory Consolidation Enables Long-Context Video Understanding"
☆27Jan 17, 2026Updated 3 weeks ago
Alternatives and similar repositories for MC-ViT
Users who are interested in MC-ViT are comparing it to the libraries listed below.
- Code for our ACL 2025 paper "Language Repository for Long Video Understanding"☆34Jun 17, 2024Updated last year
- Implementation of a Hierarchical Mamba as described in the paper: "Hierarchical State Space Models for Continuous Sequence-to-Sequence Mo…☆15Nov 11, 2024Updated last year
- Multi-Modal Multi-Embodied Hivemind-like Iteration of RTX-2☆15Jun 27, 2025Updated 7 months ago
- Implementation of "PaLM2-VAdapter:" from the multi-modal model paper: "PaLM2-VAdapter: Progressively Aligned Language Model Makes a Stron…☆17Nov 11, 2024Updated last year
- My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't rel…☆12Jan 29, 2024Updated 2 years ago
- Implementation of VisionLLaMA from the paper: "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" in PyTorch and Zeta☆16Nov 11, 2024Updated last year
- Pytorch Code for "Unified Coarse-to-Fine Alignment for Video-Text Retrieval" (ICCV 2023)☆66Jun 7, 2024Updated last year
- ☆42Apr 7, 2024Updated last year
- Official implementation of CVPR 2024 paper "vid-TLDR: Training Free Token merging for Light-weight Video Transformer".☆54Oct 21, 2025Updated 3 months ago
- rmp data ranking☆13Nov 4, 2025Updated 3 months ago
- Multi-model video-to-text by combining embeddings from Flan-T5 + CLIP + Whisper + SceneGraph. The 'backbone LLM' is pre-trained from scra…☆54Apr 21, 2023Updated 2 years ago
- [ICLR 2025] CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion☆55Jul 1, 2025Updated 7 months ago
- Implementation of MambaFormer in Pytorch ++ Zeta from the paper: "Can Mamba Learn How to Learn? A Comparative Study on In-Context Learnin…☆21Updated this week
- Graphlit Platform☆30Feb 20, 2024Updated last year
- An open source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO☆29Jan 31, 2026Updated 2 weeks ago
- ☆21May 11, 2025Updated 9 months ago
- A simpler Pytorch + Zeta Implementation of the paper: "SiMBA: Simplified Mamba-based Architecture for Vision and Multivariate Time series…☆28Nov 11, 2024Updated last year
- An experiment to see if we can process G2 reviews to extract topics from reviews☆10Feb 5, 2024Updated 2 years ago
- ☆17Sep 1, 2024Updated last year
- Community Open Source Implementation of GPT4o in PyTorch☆26Feb 9, 2026Updated last week
- Implementation of "the first large-scale multimodal mixture of experts models." from the paper: "Multimodal Contrastive Learning with…☆36Jan 31, 2026Updated 2 weeks ago
- The official code of Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval (AAAI2024)☆32Mar 29, 2024Updated last year
- ☆12Sep 25, 2023Updated 2 years ago
- This repository contains the registries for components, agents and services, the second part of the autonolas-v1 protocol.☆15Updated this week
- Self hosted AI workflow for scraping Instagram Reels (audio and description). Extracting, summarising and categorising, then storing all …☆27Jan 10, 2026Updated last month
- High Security Surveillance Camera using OpenCV, Python & Arduino☆12Jun 20, 2020Updated 5 years ago
- Implementation of the model from "Faster sorting algorithms discovered using deep reinforcement learning" that discovered an all-new ult…☆11Aug 29, 2023Updated 2 years ago
- This project is an AI Recruitment System designed to accelerate the hiring process for HR and technical recruiters.☆14Jan 3, 2025Updated last year
- ☆13Apr 27, 2021Updated 4 years ago
- Finetuning & extending DiffusionDet to video & pedestrian multi-object-tracking☆13Apr 12, 2023Updated 2 years ago
- ☆37Oct 21, 2022Updated 3 years ago
- This is the official implementation of ICCV 2025 "Flash-VStream: Efficient Real-Time Understanding for Long Video Streams"☆269Oct 15, 2025Updated 4 months ago
- 【CVPR'24】OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition☆38Apr 27, 2024Updated last year
- Implementation of AutoRT: "AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents"☆42Nov 11, 2024Updated last year
- Implementation of a modular, high-performance, and simplistic mamba for high-speed applications☆40Nov 11, 2024Updated last year
- Code and software used to design de novo protein nanomachines. Supplementary material for "Computational design of nanoscale rotational m…☆10Mar 19, 2022Updated 3 years ago
- Script parses Interactive Brokers trade report to aid in Finnish tax report fill☆13Jan 10, 2024Updated 2 years ago
- Scaffold Prompting to promote LMMs☆46Dec 16, 2024Updated last year
- An Awesome, Feature Rich Discord Bot for Hosting and Managing CTF Challenges on Discord Written in Python3☆11Jun 29, 2024Updated last year