facebookresearch / video_rep_learning
SSL Video Representation Learning project
☆10 · Updated last year
Related projects
Alternatives and complementary repositories for video_rep_learning
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding" ☆16 · Updated 2 weeks ago
- A benchmark dataset and simple code examples for measuring the perception and reasoning of multi-sensor Vision Language models. ☆16 · Updated last month
- A big_vision-inspired repo that implements a generic Auto-Encoder class capable of representation learning and generative modeling. ☆30 · Updated 4 months ago
- Directed masked autoencoders ☆14 · Updated last year
- Pytorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models ☆28 · Updated 8 months ago
- Experimental scripts for researching data adaptive learning rate scheduling. ☆23 · Updated last year
- The official repo of continuous speculative decoding ☆18 · Updated this week
- Implementation of "PaLM2-VAdapter:" from the multi-modal model paper: "PaLM2-VAdapter: Progressively Aligned Language Model Makes a Stron… ☆16 · Updated 2 weeks ago
- Official Code Repository for EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents (COLM 2024) ☆25 · Updated 4 months ago
- Official repository for the paper "Images as Weight Matrices: Sequential Image Generation Through Synaptic Learning Rules" (ICLR 2023) ☆12 · Updated last year
- Open source community's implementation of the model from "LANGUAGE MODEL BEATS DIFFUSION — TOKENIZER IS KEY TO VISUAL GENERATION" ☆15 · Updated 2 weeks ago
- Un-*** 50 billions multimodality dataset ☆24 · Updated 2 years ago
- Code for "Are “Hierarchical” Visual Representations Hierarchical?" in NeurIPS Workshop for Symmetry and Geometry in Neural Representation… ☆19 · Updated last year
- This repository includes the official implementation of our paper "Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness … ☆19 · Updated last year
- Implementation of the LDP module block in PyTorch and Zeta from the paper: "MobileVLM: A Fast, Strong and Open Vision Language Assistant … ☆14 · Updated 8 months ago
- Official implementation of ECCV24 paper: POA ☆24 · Updated 3 months ago
- The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns … ☆16 · Updated last year
- This is the implementation of CounterCurate, the data curation pipeline of both physical and semantic counterfactual image-caption pairs. ☆17 · Updated 4 months ago
- Official implementation for Sparse MetA-Tuning (SMAT) ☆14 · Updated 4 months ago
- Official repo for the TMLR paper "Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners" ☆27 · Updated 6 months ago
- Pytorch Implementation of the Model from "MIRASOL3B: A MULTIMODAL AUTOREGRESSIVE MODEL FOR TIME-ALIGNED AND CONTEXTUAL MODALITIES" ☆24 · Updated 2 weeks ago
- SIEVE: Multimodal Dataset Pruning using Image-Captioning Models (CVPR 2024) ☆14 · Updated 6 months ago
- Code and Data for Paper: SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data ☆32 · Updated 8 months ago