Skyline-9 / Visionary-Vids

Multi-modal transformer approach for joint video summarization and highlight detection based on natural language queries
11 · Updated 6 months ago

Related projects

Alternatives and complementary repositories for Visionary-Vids