zinengtang/Perceiver_VL

PyTorch code for "Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention" (WACV 2023)

☆34

Alternatives and similar repositories for Perceiver_VL

Users interested in Perceiver_VL are comparing it to the repositories listed below.

zinengtang / DeCEMBERT
PyTorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021)
☆17 · Jan 12, 2023 · Updated 3 years ago
zinengtang / TVLT
PyTorch code for “TVLT: Textless Vision-Language Transformer” (NeurIPS 2022 Oral)
☆127 · Feb 24, 2023 · Updated 3 years ago
jayleicn / mTVRetrieval
[ACL 2021] mTVR: Multilingual Video Moment Retrieval
☆27 · Aug 20, 2022 · Updated 3 years ago
rowanz / merlot_reserve
Code release for "MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound"
☆146 · Jun 1, 2022 · Updated 4 years ago
tsujuifu / pytorch_violet
A PyTorch implementation of VIOLET
☆138 · Dec 17, 2023 · Updated 2 years ago
gurkirt / preprocess-activityNet
Preprocess the ActivityNet dataset for the detection task
☆13 · Mar 3, 2017 · Updated 9 years ago
MikeWangWZHL / Paxion
Repo for the paper "Paxion: Patching Action Knowledge in Video-Language Foundation Models" (NeurIPS 2023 Spotlight)
☆38 · May 23, 2023 · Updated 3 years ago
dialogtekgeek / AVSD-DSTC10_Official
Audio Visual Scene-Aware Dialog (AVSD) Challenge at the 10th Dialog System Technology Challenge (DSTC)
☆27 · Aug 19, 2022 · Updated 3 years ago
ych133 / How2R-and-How2QA
A video retrieval dataset How2R and a video QA dataset How2QA
☆24 · Oct 15, 2020 · Updated 5 years ago
mhh0318 / UniD3
☆55 · Feb 9, 2023 · Updated 3 years ago
j-min / VL-T5
PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)
☆372 · Jul 29, 2023 · Updated 2 years ago
jason9693 / FROZEN
☆14 · May 3, 2022 · Updated 4 years ago
tejas-gokhale / vqa_mutant
☆13 · Feb 14, 2022 · Updated 4 years ago
VALUE-Leaderboard / StarterCode
Starter Code for VALUE benchmark
☆79 · Aug 23, 2022 · Updated 3 years ago
salesforce / ALPRO
Align and Prompt: Video-and-Language Pre-training with Entity Prompts
☆188 · May 1, 2025 · Updated last year
paulchhuang / 4DViewerBlender
Use Blender to visualize mesh sequences
☆24 · Sep 2, 2019 · Updated 6 years ago
facebookresearch / VLaMP
Code for “Pretrained Language Models as Visual Planners for Human Assistance”
☆64 · Jun 12, 2023 · Updated 3 years ago
zinengtang / VidLanKD
PyTorch version of VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer (NeurIPS 2021)
☆56 · Feb 6, 2023 · Updated 3 years ago
rowanz / merlot
MERLOT: Multimodal Neural Script Knowledge Models
☆226 · Mar 15, 2022 · Updated 4 years ago
yuhangzang / UPT
☆61 · May 2, 2025 · Updated last year
klauscc / VindLU
☆109 · Dec 23, 2022 · Updated 3 years ago
airsplay / vimpac
☆73 · Jun 3, 2022 · Updated 4 years ago
shiquanyang / NS-Dial
An Interpretable Neuro-Symbolic Framework for Task-Oriented Dialogue Generation
☆23 · Mar 6, 2022 · Updated 4 years ago
VALUE-Leaderboard / DataRelease
Data Release for VALUE Benchmark
☆30 · Feb 16, 2022 · Updated 4 years ago
antoyang / FrozenBiLM
[NeurIPS 2022] Zero-Shot Video Question Answering via Frozen Bidirectional Language Models
☆159 · Dec 9, 2024 · Updated last year
ArrowLuo / VideoFeatureExtractor
Video Feature Extractor for S3D-HowTo100M
☆29 · Apr 30, 2021 · Updated 5 years ago
theeluwin / sci-news-sum-kr-50
A dataset of 50 articles selected from the IT/science section of Naver News, with the sentences that make up each summary tagged.
☆40 · Nov 23, 2016 · Updated 9 years ago
jayleicn / singularity
[ACL 2023] Official PyTorch code for Singularity model in "Revealing Single Frame Bias for Video-and-Language Learning"
☆136 · May 5, 2023 · Updated 3 years ago
ylsung / Ladder-Side-Tuning
PyTorch code for "LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning"
☆241 · Jan 20, 2023 · Updated 3 years ago
chemicaltree / tetra
☆10 · Sep 14, 2022 · Updated 3 years ago
liziliao / MMConv
Official repository for "MMConv: An Environment for Multimodal Conversational Search across Multiple Domains"
☆34 · Jul 15, 2021 · Updated 5 years ago
zmykevin / UC2
CVPR 2021 official PyTorch code for UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pre-training
☆34 · Nov 9, 2021 · Updated 4 years ago
jason9693 / polyglot-finetuning-oslo
☆19 · Sep 20, 2022 · Updated 3 years ago
thunlp / PEVL
Source code for EMNLP 2022 paper “PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models”
☆49 · Nov 10, 2022 · Updated 3 years ago
Adit31 / Captionomaly-Deep-Learning-Toolbox-for-Anomaly-Captioning
Source Code for Captionomaly: A Deep Learning Toolbox for Anomaly Captioning in Surveillance Videos
☆13 · Jun 26, 2023 · Updated 3 years ago
dirkweissenborn / qa_network
Implementation of QA Networks
☆10 · Jul 14, 2016 · Updated 10 years ago
KimHyeonwoo / go-hangul
A package for Hangul (Korean alphabet)
☆13 · Dec 19, 2022 · Updated 3 years ago
Zhongying-Deng / DAC-Net
PyTorch implementation of DAC-Net ("Zhongying Deng, Kaiyang Zhou, Yongxin Yang, Tao Xiang. Domain Attention Consistency for Multi-Source …
☆24 · Dec 13, 2021 · Updated 4 years ago
yixiaoer / tpu-training-example
☆16 · Jul 8, 2024 · Updated 2 years ago