Code for "Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning [EMNLP 2025 Findings]"
☆15 · Aug 27, 2025 · Updated 6 months ago
Alternatives and similar repositories for Video-Skill-CoT
Users who are interested in Video-Skill-CoT are comparing it to the libraries listed below.
- Code and Data for Paper: SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data ☆35 · Mar 12, 2024 · Updated last year
- Code for "VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement" ☆52 · Dec 5, 2024 · Updated last year
- Official Implementation for "SiLVR : A Simple Language-based Video Reasoning Framework" ☆19 · Jan 18, 2026 · Updated last month
- (EMNLP 2025 Main) RACCooN: A Versatile Instructional Video Editing Framework with Auto-Generated Narratives ☆37 · Dec 20, 2025 · Updated 2 months ago
- Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization ☆25 · Apr 14, 2025 · Updated 10 months ago
- Streaming Video Instruction Tuning ☆45 · Feb 25, 2026 · Updated last week
- ☆24 · Oct 8, 2023 · Updated 2 years ago
- [CVPR 2025] GPS as a Control Signal for Image Generation ☆25 · Mar 18, 2025 · Updated 11 months ago
- This reviewer is based on the modules and exam guide provided via the Google Partner Kickstart program. I have taken the liberty of combi… ☆11 · Dec 15, 2020 · Updated 5 years ago
- "Scalable and Order-robust Continual Learning with Additive Parameter Decomposition", ICLR 2020 ☆23 · May 25, 2022 · Updated 3 years ago
- ☆26 · Jun 20, 2024 · Updated last year
- [CVPR2025] Official repository for "VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide" ☆28 · May 27, 2025 · Updated 9 months ago
- [NeurIPS 2025 Spotlight] Fast-Slow Thinking GRPO for Large Vision-Language Model Reasoning ☆49 · Jan 20, 2026 · Updated last month
- [ICCV2025] VEGGIE: Instructional Editing and Reasoning Video Concepts with Grounded Generation ☆33 · Aug 18, 2025 · Updated 6 months ago
- [ACL 2025] The official pytorch implement of "MIND: A Multi-agent Framework for Zero-shot Harmful Meme Detection". ☆25 · May 26, 2025 · Updated 9 months ago
- Official Code Repository for the paper "Graph Generation with Diffusion Mixture" (ICML 2024). ☆37 · May 20, 2024 · Updated last year
- Pytorch Code and Data for EnvEdit: Environment Editing for Vision-and-Language Navigation (CVPR 2022) ☆30 · Aug 2, 2022 · Updated 3 years ago
- Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning ☆141 · Aug 21, 2025 · Updated 6 months ago
- [ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences ☆43 · Mar 11, 2025 · Updated 11 months ago
- ☆25 · Updated this week
- A framework for steering MoE models by detecting and controlling behavior-linked experts. ☆29 · Sep 12, 2025 · Updated 5 months ago
- ☆10 · Jun 8, 2024 · Updated last year
- [ICCV 2025] "Player-Centric Multimodal Prompt Generation for Large Language Model Based Identity-Aware Basketball Video Captioning". ☆17 · Dec 11, 2025 · Updated 2 months ago
- This contains a practical guide for non-technical users on how to use OpenAI's Whisper for transcription and translation ☆12 · May 8, 2024 · Updated last year
- [CVPR 2026] Video-as-Answer: Predict and Generate Next Video Event with Joint-GRPO ☆96 · Updated this week
- Official code for CustAny: Customizing Anything from A Single Example. Accepted by CVPR2025 (Oral) ☆48 · Apr 10, 2025 · Updated 10 months ago
- ☆47 · Apr 20, 2025 · Updated 10 months ago
- TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning ☆114 · Dec 24, 2025 · Updated 2 months ago
- PhysGame Benchmark for Physical Commonsense Evaluation in Gameplay Videos ☆48 · Jul 3, 2025 · Updated 8 months ago
- official repo for "VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation" [EMNLP2024] ☆112 · Dec 4, 2025 · Updated 3 months ago
- HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models ☆13 · Mar 6, 2025 · Updated last year
- 🎭 Official code and dataset for our CCGPK@COLING 2022 paper - "PersonaChatGen: Generating Personalized Dialogue using GPT-3" ☆13 · Mar 26, 2024 · Updated last year
- An official codebase for "NormLens: Reading Books is Great, But Not if You Are Driving! Visually Grounded Reasoning about Defeasible Comm… ☆10 · May 9, 2024 · Updated last year
- [ICLR 2025] SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image and Video Generation ☆54 · Jan 22, 2025 · Updated last year
- Demo of orchestrating Airbyte connections with Prefect ☆11 · Mar 3, 2022 · Updated 4 years ago
- This is the official Pytorch code for our paper "Artemis: Structured Visual Reasoning for Perception Policy Learning". ☆14 · Dec 4, 2025 · Updated 3 months ago
- A Form feedback application on Linea ☆12 · Jun 24, 2024 · Updated last year
- ☆12 · Feb 12, 2026 · Updated 3 weeks ago
- The official source code of our AAAI25 paper "D&M: Enriching E-commerce Videos with Sound Effects by Key Moment Detection and SFX Matchin… ☆10 · Feb 9, 2025 · Updated last year