MILVLG/mmnas

Deep Multimodal Neural Architecture Search

☆29

Alternatives and similar repositories for mmnas

Users that are interested in mmnas are comparing it to the libraries listed below.

MILVLG / mt-captioning
A PyTorch implementation of the paper Multimodal Transformer with Multiview Visual Representation for Image Captioning
☆25 · Sep 4, 2020 · Updated 5 years ago
MILVLG / openvqa
A lightweight, scalable, and general framework for visual question answering research
☆334 · Sep 3, 2021 · Updated 4 years ago
MILVLG / rosita
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
☆57 · Jun 13, 2023 · Updated 3 years ago
HLR / Cross_Modality_Relevance
The source code of ACL 2020 paper: "Cross-Modality Relevance for Reasoning on Language and Vision"
☆27 · May 6, 2021 · Updated 5 years ago
MILVLG / bottom-up-attention.pytorch
A PyTorch reimplementation of bottom-up-attention models
☆302 · Apr 7, 2022 · Updated 4 years ago
sergiotasconmorales / consistency_vqa
Repository of paper Consistency-preserving Visual Question Answering in Medical Imaging (MICCAI2022)
☆26 · Mar 28, 2023 · Updated 3 years ago
HDUMIL / summer-school
Media Intelligence Laboratory Machine Learning / Deep Learning Summer School
☆17 · Oct 1, 2019 · Updated 6 years ago
Somedaywilldo / BM-NAS
BM-NAS: Bilevel Multimodal Neural Architecture Search (AAAI 2022 Oral)
☆20 · Dec 6, 2022 · Updated 3 years ago
facebookresearch / grid-feats-vqa
Grid features pre-training code for visual question answering
☆269 · Sep 17, 2021 · Updated 4 years ago
CrossmodalGroup / SSL-VQA
Code for our IJCAI2020 paper: Overcoming Language Priors with Self-supervised Learning for Visual Question Answering
☆52 · Aug 21, 2020 · Updated 5 years ago
Demfier / philo
Philo: uniting modalities. A repository with adaptive fusion techniques for multimodal data
☆26 · Mar 16, 2025 · Updated last year
YuJiang01 / n2nmn_pytorch
implement n2nmn with pytorch
☆19 · Apr 10, 2019 · Updated 7 years ago
vuhoangminh / vqa_medical
☆10 · Oct 20, 2022 · Updated 3 years ago
tae898 / multimodal-datasets
Multimodal datasets.
☆34 · Jan 26, 2024 · Updated 2 years ago
ZihaoWang-CV / CAMP_iccv19
CAMP: Cross-Modal Adaptive Message Passing for Text-Image Retrieval
☆126 · Feb 26, 2020 · Updated 6 years ago
tihbe / python-ebdataset
An event based dataset loader under one common python API.
☆10 · Mar 22, 2022 · Updated 4 years ago
scascin0 / alphazero
A working AlphaZero implementation that's simple enough to be able to understand what's going on at a quick glance, without sacrificing t…
☆14 · Mar 23, 2023 · Updated 3 years ago
haifangong / VQAMix
[IEEE TMI'22] VQAMix: Conditional Triplet Mixup for Medical Visual Question Answering
☆16 · Oct 9, 2022 · Updated 3 years ago
CCYChongyanChen / VQA_AlgorithmDatasets
☆37 · Jan 20, 2023 · Updated 3 years ago
mrwu-mac / DIFNet
[CVPR 2022] This repository is for the paper "DIFNet: Boosting Visual Information Flow for Image Captioning".
☆21 · Nov 28, 2022 · Updated 3 years ago
yuleiniu / introd
[NeurIPS 2021] Introspective Distillation for Robust Question Answering
☆13 · Dec 7, 2021 · Updated 4 years ago
Yvonneupup / DHNE
This repository contains implementation of DHNE : Network Representation Learning Method for Dynamic Heterogeneous Network.
☆10 · May 11, 2019 · Updated 7 years ago
terenceylchow124 / Meme-MultiModal
Multimodal Model for Memotion Dataset
☆12 · May 17, 2021 · Updated 5 years ago
Wangt-CN / MTFN-RR-PyTorch-Code
The official code for paper "Matching Images and Text with Multi-modal Tensor Fusion and Re-ranking", ACM Multimedia 2019 Oral
☆67 · Sep 28, 2019 · Updated 6 years ago
Gary-code / KECVQG
[ACM MM 2023] The released code of paper "Deconfounded Visual Question Generation with Causal Inference"
☆10 · Sep 3, 2024 · Updated last year
IMKBLE / DAMC
☆10 · Sep 24, 2021 · Updated 4 years ago
cs-jerhuang / P-VQA
Medical Knowledge-Based Network For Patient-oriented Visual Question Answering
☆19 · Feb 25, 2023 · Updated 3 years ago
MILVLG / activitynet-qa
A VideoQA dataset based on the videos from ActivityNet
☆94 · Nov 22, 2020 · Updated 5 years ago
rentainhe / TRAR-VQA
[ICCV 2021] Official implementation of the paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"
☆68 · Oct 11, 2021 · Updated 4 years ago
landskape-ai / Progressive-Pruning
Official pytorch code for "APP: Anytime Progressive Pruning" (DyNN @ ICML, 2022; CLL @ ACML, 2022, SNN @ ICML, 2022 and SlowDNN 2023)
☆16 · Nov 22, 2022 · Updated 3 years ago
gw-zhong / CMC
Codes for "Calibrating Multimodal Consensus for Emotion Recognition".
☆19 · Oct 24, 2025 · Updated 9 months ago
jnhwkim / ban-vqa
Bilinear attention networks for visual question answering
☆549 · Oct 30, 2023 · Updated 2 years ago
giparisi / gwr-tb
GWR Toolbox
☆14 · Jan 28, 2019 · Updated 7 years ago
AndresPMD / Clip_CMR
CLIP-based simple image-text matching baseline for COCO and F30K
☆15 · Sep 16, 2021 · Updated 4 years ago
Xilinx / hydra
☆14 · Feb 14, 2022 · Updated 4 years ago
volkancirik / groundnet
Repository for AAAI 2018 paper "Using Syntax for Referring Expression Recognition"
☆13 · Oct 7, 2020 · Updated 5 years ago
ubc-vision / TriBERT
Code Release for the paper "TriBERT: Full-body Human-centric Audio-visual Representation Learning for Visual Sound Separation" in NeurIPS…
☆14 · Dec 9, 2021 · Updated 4 years ago
yukang2017 / NAS-quantization
The code for Joint Neural Architecture Search and Quantization
☆14 · Apr 10, 2019 · Updated 7 years ago
asdf0982 / vqa-mfb.pytorch
This project is out of date, I don't remember the details inside...
☆85 · Dec 2, 2017 · Updated 8 years ago