Open LLaMA Eyes to See the World
☆175 Apr 16, 2023 Updated 2 years ago
Alternatives and similar repositories for Visual-LLaMA
Users who are interested in Visual-LLaMA are comparing it to the libraries listed below.
- VaLM: Visually-augmented Language Modeling. ICLR 2023. ☆56 Mar 6, 2023 Updated 3 years ago
- A practice for million-scale multi-domain universal object detection ☆28 Jun 13, 2024 Updated last year
- Open ChatGLM Eyes to See the World ☆13 Mar 30, 2023 Updated 2 years ago
- Paper List for In-context Learning 🌷 ☆20 Jan 3, 2023 Updated 3 years ago
- LGEB: Benchmark of Language Generation Evaluation ☆16 Oct 21, 2022 Updated 3 years ago
- music generation with perceiver-ar model ☆26 Jul 20, 2022 Updated 3 years ago
- ☆17 Oct 18, 2022 Updated 3 years ago
- [CVPR2023] This is an official implementation of paper "DETRs with Hybrid Matching". ☆14 Sep 1, 2022 Updated 3 years ago
- Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks (NeurIPS2022) ☆85 Nov 2, 2022 Updated 3 years ago
- Revisiting Contrastive Methods for Unsupervised Learning of Visual Representations. [NeurIPS 2021] ☆89 Oct 2, 2021 Updated 4 years ago
- MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text. ☆953 Mar 19, 2025 Updated 11 months ago
- ☆46 Aug 25, 2021 Updated 4 years ago
- VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks ☆391 Jul 9, 2024 Updated last year
- [Under preparation] Code repo for "Open-Vocabulary DETR with Conditional Matching" (ECCV 2022) ☆237 Aug 3, 2022 Updated 3 years ago
- Contrastive Object-level Pre-training with Spatial Noise Curriculum Learning ☆20 Feb 4, 2022 Updated 4 years ago
- [ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters ☆5,933 Mar 14, 2024 Updated last year
- ☆55 Feb 9, 2023 Updated 3 years ago
- ICLR 2023 DeCap: Decoding CLIP Latents for Zero-shot Captioning ☆138 Mar 16, 2023 Updated 2 years ago
- Official Repository of ChatCaptioner ☆469 Apr 13, 2023 Updated 2 years ago
- An open-source framework for training large multimodal models. ☆4,071 Aug 31, 2024 Updated last year
- Source code for COLING 2022 paper "Automatic Label Sequence Generation for Prompting Sequence-to-sequence Models" ☆24 Sep 21, 2022 Updated 3 years ago
- Official PyTorch implementation of the paper "DisCo-CLIP: A Distributed Contrastive Loss for Memory Efficient CLIP Training". ☆58 Aug 2, 2023 Updated 2 years ago
- mPLUG-Owl: The Powerful Multi-modal Large Language Model Family ☆2,540 Apr 2, 2025 Updated 11 months ago
- [NeurIPS 2023] Official implementations of "Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large Language Models" ☆525 Jan 27, 2024 Updated 2 years ago
- [ICLR 2023] PyTorch implementation of VLDet (https://arxiv.org/abs/2211.14843) ☆190 Mar 22, 2024 Updated last year
- Unified Object Tracking Framework ☆51 Jun 20, 2022 Updated 3 years ago
- Image Editing Anything ☆116 Apr 11, 2023 Updated 2 years ago
- PyTorch Implementation of Region Similarity Representation Learning (ReSim) ☆89 Jul 27, 2021 Updated 4 years ago
- [AAAI 2023] DQ-DETR: Dual Query Detection Transformer for Phrase Extraction and Grounding ☆57 Nov 28, 2022 Updated 3 years ago
- Multi-language Enhanced LLaMA ☆303 Apr 13, 2023 Updated 2 years ago
- ☆92 Nov 25, 2023 Updated 2 years ago
- ☆43 Jun 1, 2023 Updated 2 years ago
- SVIT: Scaling up Visual Instruction Tuning ☆166 Jun 20, 2024 Updated last year
- [IJCV 2025] VLPrompt-PSG: Vision-Language Prompting for Panoptic Scene Graph Generation ☆28 Sep 24, 2024 Updated last year
- [CVPR2023] All in One: Exploring Unified Video-Language Pre-training ☆281 Mar 25, 2023 Updated 2 years ago
- Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models" ☆40 Nov 11, 2024 Updated last year
- Locally Enhanced Self-Attention: Rethinking Self-Attention as Local and Context Terms ☆11 Nov 29, 2021 Updated 4 years ago
- Progressive Language-guided Visual Learning for Multi-Task Visual Grounding ☆13 May 9, 2025 Updated 9 months ago
- EVA Series: Visual Representation Fantasies from BAAI ☆2,648 Aug 1, 2024 Updated last year