daeunni / Video-Skill-CoT
Code for "Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning [EMNLP 2025 Finding]"
☆15 · Aug 27, 2025 · Updated 5 months ago
Alternatives and similar repositories for Video-Skill-CoT
Users who are interested in Video-Skill-CoT are comparing it to the repositories listed below:
- Code and Data for Paper: SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data ☆35 · Mar 12, 2024 · Updated last year
- Code for "VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement" ☆52 · Dec 5, 2024 · Updated last year
- Official Implementation for "SiLVR : A Simple Language-based Video Reasoning Framework" ☆19 · Jan 18, 2026 · Updated 3 weeks ago
- Streaming Video Instruction Tuning ☆38 · Feb 4, 2026 · Updated last week
- (EMNLP 2025 Main) RACCooN: A Versatile Instructional Video Editing Framework with Auto-Generated Narratives ☆37 · Dec 20, 2025 · Updated last month
- Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization ☆24 · Apr 14, 2025 · Updated 10 months ago
- ☆24 · Oct 8, 2023 · Updated 2 years ago
- [CVPR 2025] GPS as a Control Signal for Image Generation ☆25 · Mar 18, 2025 · Updated 10 months ago
- [NeurIPS 2025 Spotlight] Fast-Slow Thinking GRPO for Large Vision-Language Model Reasoning ☆40 · Jan 20, 2026 · Updated 3 weeks ago
- "Scalable and Order-robust Continual Learning with Additive Parameter Decomposition", ICLR 2020 ☆23 · May 25, 2022 · Updated 3 years ago
- This reviewer is based on the modules and exam guide provided via the Google Partner Kickstart program. I have taken the liberty of combi… ☆11 · Dec 15, 2020 · Updated 5 years ago
- [CVPR2025] Official repository for "VideoGuide: Improving Video Diffusion Models without Training Through a Teacher's Guide" ☆28 · May 27, 2025 · Updated 8 months ago
- ☆26 · Jun 20, 2024 · Updated last year
- [ICCV2025] VEGGIE: Instructional Editing and Reasoning Video Concepts with Grounded Generation ☆33 · Aug 18, 2025 · Updated 5 months ago
- [ACL 2025] The official pytorch implement of "MIND: A Multi-agent Framework for Zero-shot Harmful Meme Detection". ☆26 · May 26, 2025 · Updated 8 months ago
- Official Code Repository for the paper "Graph Generation with Diffusion Mixture" (ICML 2024). ☆36 · May 20, 2024 · Updated last year
- [ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences ☆42 · Mar 11, 2025 · Updated 11 months ago
- Pytorch Code and Data for EnvEdit: Environment Editing for Vision-and-Language Navigation (CVPR 2022) ☆30 · Aug 2, 2022 · Updated 3 years ago
- Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning ☆139 · Aug 21, 2025 · Updated 5 months ago
- Video-as-Answer: Predict and Generate Next Video Event with Joint-GRPO ☆92 · Dec 1, 2025 · Updated 2 months ago
- ☆10 · Jun 8, 2024 · Updated last year
- This contains a practical guide for non-technical users on how to use OpenAI's Whisper for transcription and translation ☆12 · May 8, 2024 · Updated last year
- [ICCV 2025] "Player-Centric Multimodal Prompt Generation for Large Language Model Based Identity-Aware Basketball Video Captioning". ☆15 · Dec 11, 2025 · Updated 2 months ago
- A framework for steering MoE models by detecting and controlling behavior-linked experts. ☆29 · Sep 12, 2025 · Updated 5 months ago
- Official code for CustAny: Customizing Anything from A Single Example. Accepted by CVPR2025 (Oral) ☆48 · Apr 10, 2025 · Updated 10 months ago
- ☆47 · Apr 20, 2025 · Updated 9 months ago
- PhysGame Benchmark for Physical Commonsense Evaluation in Gameplay Videos ☆47 · Jul 3, 2025 · Updated 7 months ago
- TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning ☆114 · Dec 24, 2025 · Updated last month
- official repo for "VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation" [EMNLP2024] ☆111 · Dec 4, 2025 · Updated 2 months ago
- ☆11 · Dec 5, 2025 · Updated 2 months ago
- Exposing Text-Image Inconsistency Using Diffusion Models (ICLR 2024) ☆10 · Jun 15, 2024 · Updated last year
- Fine-tuning Llama2-7b and other llms for categorising emails for Deutsche Bahn (German National Railways) ☆13 · Oct 9, 2023 · Updated 2 years ago
- [ICLR 2025] SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image and Video Generation ☆53 · Jan 22, 2025 · Updated last year
- 🎭 Official code and dataset for our CCGPK@COLING 2022 paper - "PersonaChatGen: Generating Personalized Dialogue using GPT-3" ☆13 · Mar 26, 2024 · Updated last year
- [ICLR 2026] Official Implementation of ProxyThinker: Test-Time Guidance through Small Visual Reasoners. ☆19 · Sep 24, 2025 · Updated 4 months ago
- A Form feedback application on Linea ☆12 · Jun 24, 2024 · Updated last year
- FastAPI wrapper for LLM, a fork of (oobabooga / text-generation-webui) ☆10 · Jun 1, 2023 · Updated 2 years ago
- ☆13 · May 21, 2024 · Updated last year
- [ICLR 2026] Official repo for "FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting" ☆37 · Oct 9, 2025 · Updated 4 months ago