TencentARC/ARC-Chapter

Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries

☆43

Alternatives and similar repositories for ARC-Chapter

Users who are interested in ARC-Chapter are comparing it to the repositories listed below.

TencentARC / OmniScript
OmniScript: Towards Audio-Visual Script Generation for Long-Form Cinematic Video
☆18 · Apr 24, 2026 · Updated 2 months ago
TencentARC / ARC-Hunyuan-Video-7B
Structured Video Comprehension of Real-World Shorts
☆238 · Sep 21, 2025 · Updated 10 months ago
TencentARC / Video-Holmes
[ECCV 2026] Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning?
☆95 · Jul 13, 2025 · Updated last year
TencentARC / TimeLens
[CVPR 2026] TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs
☆158 · Apr 27, 2026 · Updated 2 months ago
NIneeeeeem / LangDC
[EMNLP 2025 Oral] Official codebase for Seeing More, Saying More: Lightweight Language Experts are Dynamic Video Token Compressors.
☆18 · Sep 7, 2025 · Updated 10 months ago
lucas-ventura / chapter-llama
Official PyTorch implementation of the paper "Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs"
☆99 · Jun 6, 2025 · Updated last year
TencentARC / FLM
Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)
☆31 · May 15, 2023 · Updated 3 years ago
TencentARC / GRPO-CARE
[ACL2026 Findings] GRPO-CARE: Consistency-Aware Reinforcement Learning for Multimodal Reasoning
☆83 · Jun 23, 2025 · Updated last year
zihuixue / ProgCaptioner
Code release for the paper "Progress-Aware Video Frame Captioning" (CVPR 2025)
☆26 · Jul 16, 2025 · Updated last year
yunlong10 / AVicuna
[AAAI 2025] Empowering LLMs with Pseudo-Untrimmed Videos for Audio-Visual Temporal Understanding
☆34 · Mar 21, 2025 · Updated last year
TencentARC / TokLIP
TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation
☆236 · Aug 18, 2025 · Updated 11 months ago
mbzuai-oryx / LongShOT
A Benchmark and Agentic Framework for Omni-Modal Reasoning and Tool Use in Long Videos
☆21 · Jun 20, 2026 · Updated last month
TencentARC / MindOmni
[NeurIPS2025] The official implementation of MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO
☆139 · Oct 15, 2025 · Updated 9 months ago
yunlong10 / CAT-V
[AAAI 26 Demo] Official repo for CAT-V - Caption Anything in Video: Object-centric Dense Video Captioning with Spatiotemporal Multimodal P…
☆67 · Jan 27, 2026 · Updated 5 months ago
TencentARC / BTS
BTS: A Bi-lingual Benchmark for Text Segmentation in the Wild
☆33 · Apr 16, 2024 · Updated 2 years ago
zjr2000 / LLMVA-GEBC
Winner solution to Generic Event Boundary Captioning task in LOVEU Challenge (CVPR 2023 workshop)
☆29 · Jan 1, 2024 · Updated 2 years ago
zjr2000 / GVL
Official implementation for paper Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos
☆28 · Dec 8, 2023 · Updated 2 years ago
FeiElysia / Tempo
Tempo: Small Vision-Language Models are Smart Compressors for Long Video Understanding (ECCV 2026)
☆76 · Jun 29, 2026 · Updated 3 weeks ago
zjr2000 / Untrimmed-Video-Feature-Extractor
A simple and effective feature extractor for untrimmed videos
☆13 · Sep 1, 2022 · Updated 3 years ago
TencentARC / BlobCtrl
[SIGGRAPH ASIA'25] BlobCtrl: Taming Controllable Blob for Element-level Image Editing
☆22 · Nov 14, 2025 · Updated 8 months ago
nusnlp / d2vlm
[ICCV 2025] Factorized Learning for Temporally Grounded Video-Language Models
☆24 · Apr 18, 2026 · Updated 3 months ago
qiulu66 / Anime-Shooter
☆55 · Jun 4, 2025 · Updated last year
mashijie1028 / GenHancer
(ICCV 2025) Enhance CLIP and MLLM's fine-grained visual representations with generative models.
☆78 · Jun 25, 2025 · Updated last year
yongliu20 / Awesome-Unified-Understanding-and-Generation
☆52 · Aug 22, 2025 · Updated 10 months ago
Ziyang412 / VideoTree
Code for CVPR25 paper "VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos"
☆165 · Jun 23, 2025 · Updated last year
alibaba-mmai-research / HiCo
CVPR2022: Learning from Untrimmed Videos: Self-Supervised Video Representation Learning with Hierarchical Consistency
☆18 · Aug 10, 2022 · Updated 3 years ago
MCG-NJU / CaReBench
A Fine-grained Benchmark for Video Captioning and Retrieval
☆30 · Jul 16, 2025 · Updated last year
xpeng-robotics / UniT
☆92 · Jun 2, 2026 · Updated last month
TencentARC / mllm-npu
mllm-npu: training multimodal large language models on Ascend NPUs
☆95 · Aug 29, 2024 · Updated last year
SaraGhazanfari / CoF
Chain-of-Frames [CVPR 2026]
☆40 · Jul 2, 2025 · Updated last year
loongfeili / Martian-World-Model
[NeurIPS 2025] Official repo of "Martian World Model: Controllable Video Synthesis with Physically Accurate 3D Reconstructions"
☆20 · Aug 6, 2025 · Updated 11 months ago
yhy-2000 / MomentSeeker
☆23 · Jul 23, 2025 · Updated 11 months ago
Tencent / HaploVLM
ICML2025
☆63 · Aug 28, 2025 · Updated 10 months ago
gyxxyg / VTG-LLM
[AAAI 2025] VTG-LLM: Integrating Timestamp Knowledge into Video LLMs for Enhanced Video Temporal Grounding
☆130 · Dec 10, 2024 · Updated last year
ttengwang / Awesome_Long_Form_Video_Understanding
Awesome papers & datasets specifically focused on long-term videos.
☆381 · Oct 9, 2025 · Updated 9 months ago
zjr2000 / REVERIE
[ECCV2024] Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models
☆20 · Jul 17, 2024 · Updated 2 years ago
minjoong507 / Consistency-of-Video-LLM
[CVPR 2025] Official Repository of the paper "On the Consistency of Video Large Language Models in Temporal Comprehension"
☆16 · Oct 13, 2025 · Updated 9 months ago
ZiyuGuo99 / MME-CoF
Are Video Models Ready as Zero-shot Reasoners?
☆87 · Nov 24, 2025 · Updated 7 months ago
PolyU-ChenLab / ETBench
👾 E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding (NeurIPS 2024)
☆74 · Jan 20, 2025 · Updated last year