facebookresearch / GliTr
GliTr: Glimpse Transformers with Spatiotemporal Consistency for Online Action Prediction
☆25Updated last year
Alternatives and similar repositories for GliTr:
Users who are interested in GliTr are comparing it to the libraries listed below
- Official repository for the paper "End-to-End Visual Editing with a Generatively Pre-Trained Artist", which is accepted at ECCV 2022. Her…☆29Updated 2 years ago
- Code release for the CVPR'23 paper titled "PartDistillation: Learning Parts from Instance Segmentation"☆59Updated last year
- Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in Pytorch☆98Updated last year
- Implementation for the CVPR 2023 paper "Improving Selective Visual Question Answering by Learning from Your Peers" (https://arxiv.org/abs…☆24Updated last year
- Repository for the paper Do SSL Models Have Déjà Vu? A Case of Unintended Memorization in Self-supervised Learning☆37Updated last year
- [CVPR 2023] HierVL: Learning Hierarchical Video-Language Embeddings☆45Updated last year
- Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google Deepmind☆55Updated 4 months ago
- SAM-CLIP module for use with Autodistill.☆13Updated last year
- (ICLR 2024, CVPR 2024) SparseFormer☆70Updated 2 months ago
- This is a PyTorch implementation of the paper ViP: A Differentially Private Foundation Model for Computer Vision☆36Updated last year
- Repository for the paper: "TiC-CLIP: Continual Training of CLIP Models".☆99Updated 7 months ago
- ☆51Updated 7 months ago
- ☆58Updated 10 months ago
- Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy"☆100Updated 4 months ago
- A lightweight implementation of the ICCV 2023 paper "Reinforce Data, Multiply Impact: Improved Model Accuracy and Robustness with Dataset Rei…☆79Updated last year
- Code for “Pretrained Language Models as Visual Planners for Human Assistance”☆59Updated last year
- Timm model explorer☆36Updated 9 months ago
- Un-*** 50 billions multimodality dataset☆24Updated 2 years ago
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data☆21Updated 6 months ago
- How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges☆30Updated last year
- A JAX implementation of Broaden Your Views for Self-Supervised Video Learning, or BraVe for short.☆48Updated 7 months ago
- Companion code for "Learning to Substitute Ingredients in Recipes"☆26Updated last year
- Codebase for adaptive continual memory☆13Updated last year
- ☆28Updated last year
- Library for the Test-based Calibration Error (TCE) metric to quantify the degree of classifier calibration.☆13Updated last year
- PyTorch implementation of R-MAE https://arxiv.org/abs/2306.05411☆109Updated last year
- Official repository for the General Robust Image Task (GRIT) Benchmark☆51Updated last year
- Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding"☆19Updated this week
- ☆12Updated 5 months ago
- Python Tools for Visual Dataset Transformation☆26Updated last month