ztyang23 / BACON
☆17Updated last year
Alternatives and similar repositories for BACON
Users who are interested in BACON are comparing it to the repositories listed below
- ☆58Updated 2 years ago
- Latest Papers, Codes and Datasets on VTG-LLMs.☆19Updated 2 weeks ago
- [CVPR 2024] The official implementation of paper "Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training"☆36Updated last year
- FleVRS: Towards Flexible Visual Relationship Segmentation, NeurIPS 2024☆21Updated 8 months ago
- [ICLR 2023] CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding☆45Updated 2 months ago
- 🔥 [CVPR 2024] Official implementation of "See, Say, and Segment: Teaching LMMs to Overcome False Premises (SESAME)"☆42Updated last year
- [CVPR 2025] Test-Time Visual In-Context Tuning☆25Updated 5 months ago
- https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT☆72Updated 3 weeks ago
- ☆38Updated last month
- The official implementation of JM3D.☆30Updated 2 weeks ago
- Code and dataset link for "DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World"☆100Updated last month
- ☆26Updated 4 months ago
- Can 3D Vision-Language Models Truly Understand Natural Language?☆21Updated last year
- Learning 1D Causal Visual Representation with De-focus Attention Networks☆35Updated last year
- Official repo for CAT-V - Caption Anything in Video: Object-centric Dense Video Captioning with Spatiotemporal Multimodal Prompting☆51Updated last month
- [CVPR'24 Highlight] The official code and data for paper "EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Lan…☆61Updated 5 months ago
- High-Resolution Visual Reasoning via Multi-Turn Grounding-Based Reinforcement Learning☆47Updated last month
- [NeurIPS 2023] OV-PARTS: Towards Open-Vocabulary Part Segmentation☆88Updated last year
- ☆12Updated last year
- Test-Time Training on Video Streams☆64Updated 2 years ago
- An official repo for WACV 2025 paper "LLaVA-SpaceSGG: Visual Instruct Tuning for Open-vocabulary Scene Graph Generation with Enhanced Spa…☆21Updated 7 months ago
- (ICLR 2024, CVPR 2024) SparseFormer☆75Updated 9 months ago
- Official implementation of Next Block Prediction: Video Generation via Semi-Autoregressive Modeling☆38Updated 6 months ago
- Open-vocabulary Semantic Segmentation☆33Updated last year
- Unifying Specialized Visual Encoders for Video Language Models☆22Updated last month
- [CVPR 2023] Implementation of out-of-candidate rectification methods☆15Updated 2 years ago
- [NeurIPS 2024] Official PyTorch implementation of "Improving Compositional Reasoning of CLIP via Synthetic Vision-Language Negatives"☆41Updated 9 months ago
- [ICLR 2025] IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model☆35Updated 9 months ago
- ROOT: VLM based System for Indoor Scene Understanding and Beyond☆32Updated 7 months ago
- Code for the ICLR 2025 paper: Towards Semantic Equivalence of Tokenization in Multimodal LLM☆70Updated 4 months ago