salesforce / burn-after-reading
☆13Updated 2 years ago
Alternatives and similar repositories for burn-after-reading:
Users who are interested in burn-after-reading are comparing it to the repositories listed below.
- ☆32Updated 3 years ago
- ☆88Updated last year
- Official repository for the General Robust Image Task (GRIT) Benchmark☆54Updated 2 years ago
- codebase for the SIMAT dataset and evaluation☆39Updated 3 years ago
- Un-*** 50 billions multimodality dataset☆24Updated 2 years ago
- Tools for content datamining and NLP at scale☆43Updated 10 months ago
- Command-line tool for downloading and extending the RedCaps dataset.☆46Updated last year
- ☆64Updated last year
- ☆85Updated last year
- ☆44Updated 3 years ago
- GPU controlled Hetzner Cloud workers swarm for Crawling@Home project☆54Updated 2 years ago
- Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities☆78Updated 3 years ago
- ☆64Updated 3 months ago
- Multimodal-Procedural-Planning☆92Updated last year
- A Data Source for Reasoning Embodied Agents☆19Updated last year
- An official codebase for the paper "CHAMPAGNE: Learning Real-world Conversation from Large-Scale Web Videos (ICCV 23)"☆52Updated last year
- Filtering, Distillation, and Hard Negatives for Vision-Language Pre-Training☆137Updated 2 years ago
- ☆22Updated 2 years ago
- Data and code for NeurIPS 2021 Paper "IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning".☆52Updated last year
- This repo contains data and code for the paper "Reasoning over Public and Private Data in Retrieval-Based Systems."☆46Updated 9 months ago
- ☆24Updated 3 years ago
- This dataset contains about 110k images annotated with the depth and occlusion relationships between arbitrary objects. It enables resear…☆16Updated 3 years ago
- Research code for "Training Vision-Language Transformers from Captions Alone"☆34Updated 2 years ago
- A Unified Framework for Video-Language Understanding☆57Updated last year
- Official implementation of "Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models"☆36Updated last year
- A task-agnostic vision-language architecture as a step towards General Purpose Vision☆92Updated 3 years ago
- [BMVC22] Official Implementation of ViCHA: "Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment"☆55Updated 2 years ago
- ☆35Updated last year
- kdexd/coco-caption@de6f385☆26Updated 5 years ago
- In-the-wild Question Answering☆15Updated last year