sangminwoo / awesome-vision-and-language

A curated list of awesome vision and language resources (still under construction... stay tuned!)

☆550

Alternatives and similar repositories for awesome-vision-and-language

Users who are interested in awesome-vision-and-language are comparing it to the repositories listed below

yuewang-cuhk / awesome-vision-language-pretraining-papers
Recent Advances in Vision and Language PreTrained Models (VL-PTMs)
☆1,155 Updated 3 years ago
phellonchen / awesome-Vision-and-Language-Pre-training
Recent Advances in Vision and Language Pre-training (VLP)
☆294 Updated 2 years ago
TheShadow29 / awesome-grounding
awesome grounding: A curated list of research papers in visual grounding
☆1,114 Updated last month
jokieleung / awesome-visual-question-answering
A curated list of Visual Question Answering (VQA) (Image/Video Question Answering), Visual Question Generation, Visual Dialog, Visual Common…
☆665 Updated 2 years ago
Yutong-Zhou-cv / Awesome-Multimodality
A Survey on multimodal learning research.
☆333 Updated 2 years ago
JindongGu / Awesome-Prompting-on-Vision-Language-Model
This repo lists relevant papers summarized in our survey paper: A Systematic Survey of Prompt Engineering on Vision-Language Foundation …
☆491 Updated 7 months ago
awaisrauf / Awesome-CV-Foundational-Models
☆528 Updated 11 months ago
danieljf24 / awesome-video-text-retrieval
A curated list of deep learning resources for video-text retrieval.
☆631 Updated 2 years ago
junchen14 / Multi-Modal-Transformer
The repository collects various multi-modal transformer architectures, including image transformer, video transformer, image-languag…
☆229 Updated 3 years ago
Computer-Vision-in-the-Wild / CVinW_Readings
A collection of papers on the topic of "Computer Vision in the Wild (CVinW)"
☆1,339 Updated last year
ttengwang / Awesome_Prompting_Papers_in_Computer_Vision
A curated list of prompt-based papers in computer vision and vision-language learning.
☆925 Updated last year
microsoft / scene_graph_benchmark
Image scene graph generation benchmark
☆398 Updated 3 years ago
drmuskangarg / Multimodal-datasets
This repository is built in association with our position paper on "Multimodality for NLP-Centered Applications: Resources, Advances and …
☆309 Updated 3 years ago
j-min / VL-T5
PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)
☆372 Updated 2 years ago
salaniz / pycocoevalcap
Python 3 support for the MS COCO caption evaluation tools
☆327 Updated last year
yzhuoning / Awesome-CLIP
Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).
☆1,218 Updated last year
clip-vil / CLIP-ViL
[ICLR 2022] Code for "How Much Can CLIP Benefit Vision-and-Language Tasks?" https://arxiv.org/abs/2107.06383
☆415 Updated 2 years ago
DirtyHarryLYL / LLM-in-Vision
Recent LLM-based CV and related works. Welcome to comment/contribute!
☆873 Updated 7 months ago
zdou0830 / METER
METER: A Multimodal End-to-end TransformER Framework
☆373 Updated 2 years ago
microsoft / GenerativeImage2Text
GIT: A Generative Image-to-text Transformer for Vision and Language
☆575 Updated last year
yikuan8 / Transformers-VQA
An implementation that adapts pre-trained V+L models to downstream VQA tasks. Now supports: VisualBERT, LXMERT, and UNITER
☆165 Updated 2 years ago
vacancy / SceneGraphParser
A Python toolkit for parsing captions (in natural language) into scene graphs (as symbolic representations).
☆587 Updated last year
HenryHZY / Awesome-Multimodal-LLM
Research Trends in LLM-guided Multimodal Learning.
☆355 Updated 2 years ago
MarkMoHR / Awesome-Referring-Image-Segmentation
A collection of papers about Referring Image Segmentation.
☆774 Updated this week
uncbiag / Awesome-Foundation-Models
A curated list of foundation models for vision and language tasks
☆1,102 Updated 4 months ago
forence / Awesome-Visual-Captioning
This repository focuses on Image Captioning & Video Captioning & Seq-to-Seq Learning & NLP
☆414 Updated 2 years ago
Eurus-Holmes / Awesome-Multimodal-Research
A curated list of Multimodal Related Research.
☆1,372 Updated 2 years ago
DirtyHarryLYL / Transformer-in-Vision
Recent Transformer-based CV and related works.
☆1,333 Updated 2 years ago
google-deepmind / perception_test
☆235 Updated 4 months ago
Taaccoo / awesome-vqa-latest
Visual Question Answering Paper List.
☆53 Updated 3 years ago