saxenarohit / MovieSum
☆13Updated 10 months ago
Alternatives and similar repositories for MovieSum
Users who are interested in MovieSum are comparing it to the repositories listed below
- ☆29Updated 10 months ago
- Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents"☆44Updated 4 months ago
- Code for Paper: Harnessing Webpage UIs for Text-Rich Visual Understanding☆51Updated 6 months ago
- ☆36Updated 9 months ago
- This repo contains code for the paper "Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM"☆14Updated 2 months ago
- ☆50Updated 3 weeks ago
- helper functions for processing and integrating visual language information with Qwen-VL Series Model☆14Updated 9 months ago
- This repo contains code and data for ICLR 2025 paper MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs☆31Updated 3 months ago
- This is the code repo for our paper "Benchmarking Retrieval-Augmented Generation in Multi-Modal Contexts".☆34Updated 3 months ago
- Assessing Context-Aware Creative Intelligence in MLLMs☆21Updated this week
- A Framework for Decoupling and Assessing the Capabilities of VLMs☆44Updated last year
- Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs☆85Updated 8 months ago
- A curated list of resources about long-context in large-language models and video understanding.☆31Updated last year
- ☆14Updated last month
- Official Implementation of APB (ACL 2025 main)☆28Updated 4 months ago
- YesBut - Multimodal Satire Comprehension Dataset☆17Updated 8 months ago
- This is the official project of paper: Compress to Impress: Unleashing the Potential of Compressive Memory in Real-World Long-Term Conver…☆19Updated 7 months ago
- LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models☆18Updated 3 months ago
- Fast LLM Training CodeBase With dynamic strategy choosing [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler];☆39Updated last year
- ☆36Updated 2 years ago
- Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073☆28Updated 11 months ago
- ☆13Updated 6 months ago
- Code Implementation, Evaluations, Documentation, Links and Resources for Min P paper☆39Updated 3 months ago
- Nexusflow function call, tool use, and agent benchmarks.☆20Updated 6 months ago
- The official code repo and data hub of top_nsigma sampling strategy for LLMs.☆26Updated 4 months ago
- ☆38Updated 2 months ago
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning?☆25Updated 3 months ago
- [ICCV 2025] Official Repository of VideoLLaMB: Long Video Understanding with Recurrent Memory Bridges☆70Updated 4 months ago
- Repository for the NeurIPS 2024 paper "SearchLVLMs: A Plug-and-Play Framework for Augmenting Large Vision-Language Models by Searching Up…☆24Updated 6 months ago
- ☆35Updated 9 months ago