yliu-cs / PiTe
[ECCV'24 Oral] PiTe: Pixel-Temporal Alignment for Large Video-Language Model
☆17 Updated 7 months ago
Alternatives and similar repositories for PiTe
Users who are interested in PiTe are comparing it to the libraries listed below
- ☆23 Updated 3 months ago
- TEMPURA enables video-language models to reason about causal event relationships and generate fine-grained, timestamped descriptions of u… ☆22 Updated 3 months ago
- Text-Only Data Synthesis for Vision Language Model Training ☆22 Updated 3 months ago
- [ECCV 2024] Learning Video Context as Interleaved Multimodal Sequences ☆40 Updated 6 months ago
- On Path to Multimodal Generalist: General-Level and General-Bench