facebookresearch / GliTr
GliTr: Glimpse Transformers with Spatiotemporal Consistency for Online Action Prediction
☆25Updated last year
Related projects
Alternatives and complementary repositories for GliTr
- Code for “Pretrained Language Models as Visual Planners for Human Assistance”☆57Updated last year
- Code release for the CVPR'23 paper titled "PartDistillation: Learning Parts from Instance Segmentation"☆59Updated 11 months ago
- Official repository for the paper "End-to-End Visual Editing with a Generatively Pre-Trained Artist", which is accepted at ECCV 2022. Her…☆29Updated last year
- Repository for the paper: "TiC-CLIP: Continual Training of CLIP Models".☆95Updated 5 months ago
- [ECCV2024][ICCV2023] Official PyTorch implementation of SeiT++ and SeiT☆51Updated 3 months ago
- Load any clip model with a standardized interface☆21Updated 7 months ago
- Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google DeepMind☆53Updated 2 months ago
- Timm model explorer☆36Updated 7 months ago
- Repository for the paper "Do SSL Models Have Déjà Vu? A Case of Unintended Memorization in Self-supervised Learning"☆37Updated last year
- Companion code for "Learning to Substitute Ingredients in Recipes"☆26Updated last year
- Un-*** 50 billions multimodality dataset☆24Updated 2 years ago
- Implementation of "PaLM2-VAdapter:" from the multi-modal model paper: "PaLM2-VAdapter: Progressively Aligned Language Model Makes a Stron…☆16Updated 2 weeks ago
- Implementation for the CVPR 2023 paper "Improving Selective Visual Question Answering by Learning from Your Peers" (https://arxiv.org/abs…☆24Updated last year
- Video descriptions of research papers relating to foundation models and scaling☆30Updated last year
- [NeurIPS 2022] code for "K-LITE: Learning Transferable Visual Models with External Knowledge" https://arxiv.org/abs/2204.09222☆51Updated last year
- Implementation of MaMMUT, a simple vision-encoder text-decoder architecture for multimodal tasks from Google, in PyTorch☆97Updated last year
- ☆64Updated last year
- A JAX implementation of Broaden Your Views for Self-Supervised Video Learning, or BraVe for short.☆48Updated 5 months ago
- This is a PyTorch implementation of the paper "ViP: A Differentially Private Foundation Model for Computer Vision"☆37Updated last year
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data☆19Updated 3 months ago
- A curated list of Survey Papers on Deep Learning.☆10Updated last year
- Official repository for the General Robust Image Task (GRIT) Benchmark☆50Updated last year
- Directed masked autoencoders☆14Updated last year
- More dimensions = More fun☆21Updated 3 months ago
- Code and models for the paper "The effectiveness of MAE pre-pretraining for billion-scale pretraining" https://arxiv.org/abs/2303.13496☆81Updated 4 months ago
- Library for the Test-based Calibration Error (TCE) metric to quantify the degree of classifier calibration.☆13Updated last year
- Implementation of some personal helper functions for Einops, my most favorite tensor manipulation library ❤️☆52Updated last year
- FID computation in Jax/Flax.☆24Updated 4 months ago