WenliangGuo/SCHEMA

[ICLR 2024 Poster] SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos

☆20

Alternatives and similar repositories for SCHEMA

Users that are interested in SCHEMA are comparing it to the libraries listed below.

WiserZhou / MTID
Official PyTorch Implementation of Masked Temporal Interpolation Diffusion for Procedure Planning in Instructional Videos
☆11 · Jul 10, 2026 · Updated 2 weeks ago
facebookresearch / htstep
HT-Step is a large-scale article grounding dataset of temporal step annotations on how-to videos
☆26 · Mar 20, 2024 · Updated 2 years ago
zjuchenlong / WSAG
[EMNLP'22] Weakly-Supervised Temporal Article Grounding
☆14 · Nov 25, 2023 · Updated 2 years ago
Ravindu-Yasas-Nagasinghe / KEPP
[CVPR 2024] KEPP: Why Not Use Your Textbook? Knowledge-Enhanced Procedure Planning of Instructional Videos
☆12 · Sep 24, 2024 · Updated last year
yuleiniu / introd
[NeurIPS 2021] Introspective Distillation for Robust Question Answering
☆13 · Dec 7, 2021 · Updated 4 years ago
salesforce / paprika
Code for CVPR 2023 paper "Procedure-Aware Pretraining for Instructional Video Understanding"
☆50 · Jun 2, 2026 · Updated last month
jutanke / social_diffusion
Re-implementation for ICCV23 "Social Diffusion: Long-term Multiple Human Motion Anticipation"
☆24 · Oct 3, 2023 · Updated 2 years ago
soCzech / MultiTaskObjectStates
Code for the paper "Multi-Task Learning of Object States and State-Modifying Actions from Web Videos" published in TPAMI
☆11 · Mar 3, 2024 · Updated 2 years ago
gqa-ood / GQA-OOD
GQA-OOD is a new dataset and benchmark for the evaluation of VQA models in OOD (out of distribution) settings.
☆33 · Mar 1, 2021 · Updated 5 years ago
dpfried / action-segmentation
Weakly-supervised action segmentation in video
☆16 · Feb 13, 2022 · Updated 4 years ago
ruotianluo / rtutils
☆17 · Sep 2, 2023 · Updated 2 years ago
facebookresearch / TaskGraph
Official code repository for "Video-Mined Task Graphs for Keystep Recognition in Instructional Videos" arXiv, 2023
☆15 · Apr 1, 2024 · Updated 2 years ago
olga-zats / GTDA
[ECCV2024] Gated Temporal Action Anticipation for Stochastic Long-Term Anticipation
☆24 · May 29, 2025 · Updated last year
ChinaYi / asrf_with_asformer
Replace the MS-TCN with ASFormer in asrf
☆23 · Oct 28, 2021 · Updated 4 years ago
facebookresearch / VLaMP
Code for “Pretrained Language Models as Visual Planners for Human Assistance”
☆64 · Jun 12, 2023 · Updated 3 years ago
facebookresearch / ProcedureVRL
[CVPR 2023] Official code for "Learning Procedure-aware Video Representation from Instructional Videos and Their Narrations"
☆56 · Aug 8, 2023 · Updated 2 years ago
cdancette / vqa-cp-leaderboard
A collection of papers about VQA-CP datasets and their results
☆42 · Mar 18, 2022 · Updated 4 years ago
ChopinSharp / ref-nms
Official codebase for "Ref-NMS: Breaking Proposal Bottlenecks in Two-Stage Referring Expression Grounding"
☆22 · Dec 20, 2020 · Updated 5 years ago
yyf17 / SAAVN
SAAVN Code release for paper "Sound Adversarial Audio-Visual Navigation, ICLR2022" (In PyTorch)
☆21 · Nov 9, 2022 · Updated 3 years ago
jialinwu17 / self_critical_vqa
Code for NeurIPS 2019 paper "Self-Critical Reasoning for Robust Visual Question Answering"
☆40 · Sep 9, 2019 · Updated 6 years ago
facebookresearch / VidOSC
Code and data release for the paper "Learning Object State Changes in Videos: An Open-World Perspective" (CVPR 2024)
☆37 · Sep 9, 2024 · Updated last year
robert80203 / EgoPER_official
The official implementation of Error Detection in Egocentric Procedural Task Videos
☆33 · Sep 20, 2025 · Updated 10 months ago
chentong0 / rl-binary-rar
Official repo for "Binary Retrieval-augmented Reward Mitigates Hallucinations"
☆15 · Nov 13, 2025 · Updated 8 months ago
olga-zats / goal_consistency
[ICIP2023] Code for the paper 'Action Anticipation with Goal Consistency'
☆12 · Apr 5, 2024 · Updated 2 years ago
cyfml / OPSTL
OPSTL: Self-supervised Skeleton-based Action Recognition in Occluded Environments
☆14 · Oct 25, 2023 · Updated 2 years ago
Finspire13 / DiffAct
Code for Diffusion Action Segmentation (ICCV 2023)
☆78 · Aug 16, 2023 · Updated 2 years ago
filteredcophy / FilteredCoPhy
☆10 · Nov 17, 2022 · Updated 3 years ago
SpencerWhitehead / novelvqa
☆27 · Oct 7, 2021 · Updated 4 years ago
IDEA-XL / SubgDiff
The official implementation of NeurIPS2024 paper "SubgDiff: A Subgraph Diffusion Model to Improve Molecular Representation Learning."
☆11 · May 28, 2025 · Updated last year
HankKung / Auto-Dynamic-DeepLab
[IROS 2021] ADD: A Fine-grained Dynamic Inference Architecture for Semantic Image Segmentation
☆10 · May 3, 2022 · Updated 4 years ago
limanling / KnowledgeVL-Reading
☆67 · Jun 18, 2023 · Updated 3 years ago
soCzech / LookForTheChange
Code for Look for the Change paper published at CVPR 2022
☆36 · Oct 26, 2022 · Updated 3 years ago
cdancette / rubi.bootstrap.pytorch
NeurIPS 2019 Paper: RUBi: Reducing Unimodal Biases for Visual Question Answering
☆66 · Mar 29, 2021 · Updated 5 years ago
omron-sinicx / com_kitchens
COM Kitchens: An Unedited Overhead-view Video Dataset as a Vision-Language Benchmark
☆15 · Aug 22, 2024 · Updated last year
LTContext / LTContext
[ICCV 2023] How Much Temporal Long-Term Context is Needed for Action Segmentation?
☆50 · Jun 21, 2024 · Updated 2 years ago
Yuhan-Shen / ProTAS
Progress-Aware Online Action Segmentation for Egocentric Procedural Task Videos
☆38 · Sep 9, 2024 · Updated last year
dipika-singhania / ICC-Semi-Supervised-TAS
Iterative Contrast-Classify For Semi-supervised Temporal Action Segmentation
☆11 · Jul 24, 2023 · Updated 3 years ago
yuleiniu / cfvqa
[CVPR 2021] Counterfactual VQA: A Cause-Effect Look at Language Bias
☆136 · Dec 15, 2021 · Updated 4 years ago
zihuixue / seeAoT
Code and data release for the paper "Seeing the Arrow of Time in Large Multimodal Models"
☆16 · Oct 2, 2025 · Updated 9 months ago