☆47Apr 29, 2024Updated last year
Alternatives and similar repositories for noc
Users that are interested in noc are comparing it to the libraries listed below
Sorting:
- Repository for the paper "Data Efficient Masked Language Modeling for Vision and Language".☆18Sep 17, 2021Updated 4 years ago
- Official implementation of TCL (CVPR 2023)☆120May 11, 2023Updated 2 years ago
- Pytorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021)☆17Jan 12, 2023Updated 3 years ago
- [ICLR2024] The official implementation of paper "UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling", by …☆77Jan 27, 2024Updated 2 years ago
- ☆19Nov 22, 2022Updated 3 years ago
- PyTorch Implementation of Sparse DETR☆176Jan 3, 2024Updated 2 years ago
- Pytorch code for Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners☆116Sep 15, 2022Updated 3 years ago
- Source code of "Leaky Thoughts: Large Reasoning Models Are Not Private Thinkers" EMNLP 2025☆16Jan 12, 2026Updated last month
- A Simple Framwork for CV Pre-training Model (SOCO, VirTex, BEiT)☆15Oct 18, 2021Updated 4 years ago
- [ICLR 2022] "As-ViT: Auto-scaling Vision Transformers without Training" by Wuyang Chen, Wei Huang, Xianzhi Du, Xiaodan Song, Zhangyang Wa…☆76Feb 21, 2022Updated 4 years ago
- Code for ACL 2023 Oral Paper: ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning☆12Aug 23, 2025Updated 6 months ago
- ☆18Apr 16, 2025Updated 10 months ago
- [ECCV 2024] Official PyTorch implementation of LUT "Learning with Unmasked Tokens Drives Stronger Vision Learners"☆13Dec 1, 2024Updated last year
- GPT as Knowledger Worker (or if you really want, GPT Sorta' Takes the CPA Exam)☆13Jan 24, 2023Updated 3 years ago
- Code for the paper "Understanding and Evaluating Racial Biases in Image Captioning"☆12Oct 19, 2021Updated 4 years ago
- ☆73Jun 3, 2022Updated 3 years ago
- MDMMT: Multidomain Multimodal Transformer for Video Retrieval☆26Jun 28, 2021Updated 4 years ago
- Research code for CVPR 2022 paper: "EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching"☆26Oct 20, 2022Updated 3 years ago
- Locally Hierarchical Auto-Regressive Modeling for Image Generation (HQ-Transformer)☆28Feb 14, 2024Updated 2 years ago
- ☆85Dec 4, 2022Updated 3 years ago
- Learning Debiased and Disentangled Representations for Semantic Segmentation (NeurIPS 2021)☆13Jan 23, 2022Updated 4 years ago
- A Benchmark for Robust, Multi-evidence, Multi-answer Question Answering☆17Jan 7, 2023Updated 3 years ago
- [EMNLP 2022] Distillation-Resistant Watermarking (DRW) for Model Protection in NLP☆13Aug 17, 2023Updated 2 years ago
- Code for DVD A Diagnostic Dataset for Multi-step Reasoning in Video Grounded Dialogue☆14Oct 12, 2021Updated 4 years ago
- PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR2022)☆209Dec 18, 2022Updated 3 years ago
- MLPs for Vision and Langauge Modeling (Coming Soon)☆27Dec 9, 2021Updated 4 years ago
- Patching open-vocabulary models by interpolating weights☆91Sep 28, 2023Updated 2 years ago
- ☆33Aug 16, 2021Updated 4 years ago
- ☆17Oct 18, 2022Updated 3 years ago
- ImageBART: Bidirectional Context with Multinomial Diffusion for Autoregressive Image Synthesis☆126Mar 14, 2022Updated 3 years ago
- Code for the paper "Controllable Video Captioning with an Exemplar Sentence"☆12Apr 14, 2021Updated 4 years ago
- ☆11Dec 8, 2022Updated 3 years ago
- Data Release for VALUE Benchmark☆30Feb 16, 2022Updated 4 years ago
- Repo for "Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks" ACL 2023 Findings☆15May 3, 2023Updated 2 years ago
- Dataset for Bilingual VLN☆11Dec 5, 2020Updated 5 years ago
- [ECCV 2022] GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval☆17Aug 24, 2022Updated 3 years ago
- CapDec: SOTA Zero Shot Image Captioning Using CLIP and GPT2, EMNLP 2022 (findings)☆203Jan 28, 2024Updated 2 years ago
- End-to-end Multi-modal Video Temporal Grounding, NeurIPS 2021☆18Oct 24, 2021Updated 4 years ago
- Code and dataset for NAACL 2022 paper "CoSIm: Commonsense Reasoning for Counterfactual Scene Imagination" Hyounghun Kim, Abhay Zala, Mohi…☆16Nov 26, 2022Updated 3 years ago