mrseanryan / finetune_LLaVA
Fine-tune LLaVA 1.5, based on an article by wandb
☆11 Updated last year
Alternatives and similar repositories for finetune_LLaVA:
Users who are interested in finetune_LLaVA are comparing it to the repositories listed below:
- This repository compiles a list of papers related to Video LLM. ☆19 Updated 8 months ago
- Official code for our paper "Harnessing Uncertainty-aware Bounding Boxes for Unsupervised 3D Object Detection". ☆15 Updated 4 months ago
- Official repo for our ECCV'24 paper: Approaching Outside: Scaling Unsupervised 3D Object Detection from 2D Scene. ☆33 Updated 6 months ago
- The official implementation of Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion ☆36 Updated last week
- Vision-oriented multimodal AI ☆49 Updated 8 months ago
- [ECAI 2023] MonoSKD: General Distillation Framework for Monocular 3D Object Detection via Spearman Correlation Coefficient ☆30 Updated last year
- ☆19 Updated last year
- [ICCV 23] A Simple Vision Transformer for Weakly Semi-supervised 3D Object Detection ☆11 Updated 10 months ago
- ☆30 Updated this week
- MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning ☆62 Updated 11 months ago
- LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding ☆22 Updated last week
- Curricular Object Manipulation in LiDAR-based Object Detection (CVPR 2023) ☆37 Updated last year
- ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration ☆24 Updated 2 months ago
- [IEEE TCSVT] Official Pytorch Implementation of CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation. ☆38 Updated 2 months ago
- ☆31 Updated 3 months ago
- Harnessing CLIP, DINO and SAM for Open Vocabulary Segmentation ☆43 Updated this week
- 【IEEE T-IV】A systematic survey of multi-modal and multi-task visual understanding foundation models for driving scenarios ☆49 Updated 9 months ago
- 3DGraphLLM is a model that uses a 3D scene graph and an LLM to perform 3D vision-language tasks. ☆39 Updated 2 months ago
- Public repository for the ECCV 2024 paper "Train Till You Drop: Towards Stable and Robust Source-free Unsupervised 3D Domain Adaptation". ☆22 Updated 5 months ago
- Language Driven Occupancy Prediction ☆15 Updated 2 months ago
- ☆16 Updated last year
- Interface for GenAI-Arena ☆13 Updated last year
- ☆41 Updated last year
- Official implementation of CVPR 2024 paper "Retrieval-Augmented Open-Vocabulary Object Detection". ☆33 Updated 5 months ago
- Taming Self-Training for Open-Vocabulary Object Detection, CVPR 2024 ☆21 Updated last year
- (ICLR 2024, CVPR 2024) SparseFormer ☆73 Updated 3 months ago
- Project for "LaSagnA: Language-based Segmentation Assistant for Complex Queries". ☆53 Updated 10 months ago
- ☆25 Updated last year