llp1992 / Kanva
☆11Updated 9 months ago
Related projects:
- [ECCV 2024] This is the official implementation of "Stitched ViTs are Flexible Vision Backbones".☆22Updated 7 months ago
- REVO-LION: Evaluating and Refining Vision-Language Instruction Tuning Datasets☆11Updated 11 months ago
- [CVPR 2024] DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model☆15Updated 5 months ago
- ☆20Updated 9 months ago
- A lightweight, highly efficient training framework for accelerating diffusion tasks.☆13Updated last week
- This repository is for the first survey on SAM for videos.☆11Updated last month
- Benchmarking Attention Mechanism in Vision Transformers.☆16Updated last year
- [CVPR 2023] This is an official implementation of the paper "DETRs with Hybrid Matching".☆14Updated 2 years ago
- A curated list of papers and resources for text-to-image evaluation.☆26Updated last year
- Paper List for In-context Learning 🌷☆20Updated last year
- A benchmark dataset for evaluating LLMs' SVG editing capabilities☆13Updated 4 months ago
- ☆22Updated 8 months ago
- OpenMMLab Detection Toolbox and Benchmark for V3Det☆15Updated 5 months ago
- ScaleNet: Searching for the Model to Scale (ECCV 2022)☆12Updated last year
- Official code for the paper "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter".☆15Updated last year
- ☆37Updated 7 months ago
- INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model☆36Updated last month
- ☆31Updated 3 months ago
- IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model☆18Updated last week
- Official repository for the paper "Images as Weight Matrices: Sequential Image Generation Through Synaptic Learning Rules" (ICLR 2023)☆12Updated last year
- Description and applications of OpenAI's paper about DALL-E (2021) and implementation of other (CLIP-guided) zero-shot text-to-image gene…☆29Updated 2 years ago
- ☆19Updated last year
- Official repo for the paper "Discffusion: Discriminative Diffusion Models as Few-shot Vision and Language Learners"☆26Updated 4 months ago
- Stay tuned!☆11Updated 5 months ago
- Code and Data for Paper: SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data☆30Updated 6 months ago
- Code for the ICML 2023 paper "Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation"☆35Updated last year
- Official PyTorch implementation for Distilling Image Classifiers in Object detection (NeurIPS 2021)☆30Updated 2 years ago
- Accepted by AAAI 2022☆21Updated 2 years ago
- ☆13Updated this week