ZJULearning/TreeAttention

A Better Way to Attend: Attention with Trees for Video Question Answering

☆25

Alternatives and similar repositories for TreeAttention

Users who are interested in TreeAttention are comparing it to the repositories listed below.

fanchenyou / HME-VideoQA
Heterogeneous Memory Enhanced Multimodal Attention Model for VideoQA
☆55 · Sep 13, 2021 · Updated 4 years ago
andreaazzini / multidensenet
A PyTorch implementation of DenseNet, supporting multiclass and multilabel classification.
☆24 · Aug 11, 2017 · Updated 8 years ago
jhyuklee / dmn-pytorch
Re-implementation: Ask Me Anything: Dynamic Memory Networks for Natural Language Processing
☆14 · Apr 7, 2019 · Updated 7 years ago
agakshat / visualdialog-pytorch
Community Regularization of Visually Grounded Dialog https://arxiv.org/abs/1808.04359
☆15 · May 16, 2019 · Updated 7 years ago
jayleicn / TVQAplus
[ACL 2020] PyTorch code for TVQA+: Spatio-Temporal Grounding for Video Question Answering
☆132 · Oct 25, 2022 · Updated 3 years ago
YunseokJANG / tgif-qa
Repository for our CVPR 2017 and IJCV: TGIF-QA
☆180 · Sep 6, 2021 · Updated 4 years ago
dialogtekgeek / AudioVisualSceneAwareDialog
☆27 · May 4, 2020 · Updated 6 years ago
xiaojino / RUArt
RUArt: A Novel Text-Centered Solution for Text-Based Visual Question Answering
☆10 · Nov 27, 2022 · Updated 3 years ago
wh0330 / CAG_VisDial
☆15 · Aug 13, 2020 · Updated 5 years ago
xudejing / video-question-answering
Video Question Answering via Gradually Refined Attention over Appearance and Motion
☆178 · Dec 5, 2017 · Updated 8 years ago
JonghwanMun / MarioQA
Repository for MarioQA: Answering Questions by Watching Gameplay Videos in ICCV 2017
☆10 · Oct 28, 2025 · Updated 8 months ago
VisionLearningGroup / Ask_Attend_and_Answer
Ask, Attend and Answer: Exploring Question-Guided Spatial Attention for Visual Question Answering
☆25 · Nov 4, 2020 · Updated 5 years ago
JunweiLiang / DualAttentionNetwork
This repository contains the tensorflow implementation and models for DAN - CVPR 2017 paper
☆22 · Jul 13, 2018 · Updated 8 years ago
idansc / simple-avsd
Code for "A Simple Baseline for Audio-Visual Scene-Aware Dialog"
☆27 · May 26, 2020 · Updated 6 years ago
StanfordVL / STGraph
Codebase for CVPR 2020 paper "Spatio-Temporal Graph for Video Captioning with Knowledge Distillation"
☆23 · Mar 4, 2020 · Updated 6 years ago
thaolmk54 / hcrn-videoqa
Implementation for the paper "Hierarchical Conditional Relation Networks for Video Question Answering" (Le et al., CVPR 2020, Oral)
☆135 · Jul 25, 2024 · Updated last year
bupt-cist / vqa-playground-pytorch
Code for NIPS 2018 paper, "Chain of Reasoning for Visual Question Answering"
☆28 · Nov 23, 2018 · Updated 7 years ago
VITA-Group / layerGraftedPretraining_ICLR23
[ICLR 2023] "Layer Grafted Pre-training: Bridging Contrastive Learning And Masked Image Modeling For Better Representations", Ziyu Jian…
☆24 · Feb 16, 2023 · Updated 3 years ago
edchengg / VAE_GAN
VAE+GAN
☆10 · Apr 18, 2018 · Updated 8 years ago
taeho-kil / Scene-Text-Rectification
Scene text rectification using glyph and character alignment properties
☆22 · Jan 21, 2018 · Updated 8 years ago
deep-spin / OpenNMT-entmax
☆15 · May 14, 2019 · Updated 7 years ago
yuleiniu / rva
Code for CVPR'19 "Recursive Visual Attention in Visual Dialog"
☆64 · Mar 24, 2023 · Updated 3 years ago
salesforce / BiST
Code for the paper BiST: Bi-directional Spatio-Temporal Reasoning for Video-Grounded Dialogues (EMNLP20)
☆11 · Jun 16, 2025 · Updated last year
josiehong / Multi-dilated_CAM
A personal implementation of the CAM (class activation mapping) part of CVPR 2018 Revisiting Dilated Convolution: A Simple Approach for Weakly- and…
☆15 · Jul 15, 2019 · Updated 7 years ago
sail-sg / VGT
Video Graph Transformer for Video Question Answering (ECCV'22)
☆49 · Jun 8, 2023 · Updated 3 years ago
HongyangGao / hConv-gPool-Net
TensorFlow implementation of Learning Graph Pooling and Hybrid Convolutional Operations for Text Representations (WWW19)
☆27 · Mar 31, 2019 · Updated 7 years ago
ternaus / kaggle_planet
Planet: Understanding the Amazon from Space
☆12 · Jul 23, 2017 · Updated 9 years ago
raingo / TGIF-Release
Animated GIF Description Dataset
☆117 · Jun 17, 2024 · Updated 2 years ago
zilongzheng / visdial-gnn
PyTorch code for Reasoning Visual Dialogs with Structural and Partial Observations
☆42 · Jun 30, 2021 · Updated 5 years ago
LuoweiZhou / pytorch-pretrained-BERT
📖The Big-&-Extending-Repository-of-Transformers: Pretrained PyTorch models for Google's BERT, OpenAI GPT & GPT-2, Google/CMU Transformer…
☆11 · May 30, 2019 · Updated 7 years ago
yashkant / sam-textvqa
Official code for paper "Spatially Aware Multimodal Transformers for TextVQA" published at ECCV, 2020.
☆65 · Sep 15, 2021 · Updated 4 years ago
hanxiaoheihei / BIT_QA_System
Front end and back end of a question-answering system
☆15 · Jan 25, 2021 · Updated 5 years ago
logxio / picchio
Catches silent CPU fallback and mislabeled tok/s in local LLMs. llama.cpp and ollama, one file, no deps.
☆17 · Jul 16, 2026 · Updated last week
guoswang / TensorBoard
A simple example for visualizing tf-code
☆17 · Jan 22, 2018 · Updated 8 years ago
HeroKillerEver / SeqGAN-Pytorch
An optimized version of SeqGAN in pytorch
☆12 · Apr 24, 2018 · Updated 8 years ago
Cadene / murel.bootstrap.pytorch
MUREL (CVPR 2019), a multimodal relational reasoning module for VQA
☆194 · Feb 9, 2020 · Updated 6 years ago
gicheonkang / dan-visdial
✨ Official PyTorch Implementation for EMNLP'19 Paper, "Dual Attention Networks for Visual Reference Resolution in Visual Dialog"
☆44 · Mar 19, 2023 · Updated 3 years ago
HCIILAB / LAST
Read Ten Lines at One Glance: Line-Aware Semi-Autoregressive Transformer for Multi-Line Handwritten Mathematical Expression Recognition
☆28 · Aug 29, 2023 · Updated 2 years ago
ionmadrazo / Vec2Read
☆10 · Oct 3, 2023 · Updated 2 years ago