OpenGVLab / DiffAgent

[CVPR 2024] DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model

☆17

Alternatives and similar repositories for DiffAgent

Users who are interested in DiffAgent are comparing it to the repositories listed below.

HubHop / vit-attention-benchmark
Benchmarking Attention Mechanism in Vision Transformers.
☆18 Updated 3 years ago
ggjy / vision_weak_to_strong
☆37 Updated last year
donglixp / ICL_PaperList
Paper List for In-context Learning 🌷
☆20 Updated 2 years ago
zju-vipa / training_free_model_merging
This repository is the implementation of the paper Training Free Pretrained Model Merging (CVPR2024).
☆30 Updated last year
ziplab / SN-Netv2
[ECCV 2024] This is the official implementation of "Stitched ViTs are Flexible Vision Backbones".
☆28 Updated last year
Kwai-YuanQi / TaskGalaxy
Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types
☆31 Updated 3 months ago
MengLcool / DeepStack-VL
[NeurIPS-24] This is the official implementation of the paper "DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effect…
☆61 Updated last year
mightyzau / InfMLLM
☆19 Updated last year
MarkXCloud / CSpD
The official repo of continuous speculative decoding
☆30 Updated 7 months ago
savadikarc / wegeft
WeGeFT: Weight‑Generative Fine‑Tuning for Multi‑Faceted Efficient Adaptation of Large Models
☆21 Updated 3 months ago
TencentARC / pi-Tuning
Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023.
☆33 Updated 2 years ago
OliverRensu / DeepMIM
[WACV2025 Oral] DeepMIM: Deep Supervision for Masked Image Modeling
☆54 Updated 5 months ago
SparksJoe / Prism
A Framework for Decoupling and Assessing the Capabilities of VLMs
☆43 Updated last year
facebookresearch / ViP-MAE
This is a PyTorch implementation of the paper "ViP: A Differentially Private Foundation Model for Computer Vision".
☆36 Updated 2 years ago
inclusionAI / M2-Reasoning
M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning
☆45 Updated 3 months ago
KD-TAO / VidKV
VidKV: Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models
☆22 Updated 7 months ago
Letian2003 / MM_INF
An efficient multi-modal instruction-following data synthesis tool and the official implementation of Oasis https://arxiv.org/abs/2503.08…
☆32 Updated 4 months ago
OpenGVLab / LLMPrune-BESA
BESA is a differentiable weight pruning technique for large language models.
☆17 Updated last year
zhangjiewu / awesome-t2i-eval
A curated list of papers and resources for text-to-image evaluation.
☆30 Updated 2 years ago
wy1iu / OPT
Implementation for <Orthogonal Over-Parameterized Training> in CVPR'21.
☆22 Updated 4 years ago
opendatalab / MLLM-DataEngine
MLLM-DataEngine: An Iterative Refinement Approach for MLLM
☆48 Updated last year
V3Det / mmdetection-V3Det
OpenMMLab Detection Toolbox and Benchmark for V3Det
☆15 Updated last year
ys-zong / MIRB
Benchmarking Multi-Image Understanding in Vision and Language Models
☆12 Updated last year
LeapLabTHU / Deep-Incubation
Code release for Deep Incubation (https://arxiv.org/abs/2212.04129)
☆90 Updated 2 years ago
linzhiqiu / CLIP-FlanT5
Training code for CLIP-FlanT5
☆30 Updated last year
BAAI-DCAI / Dataset-Pruning
Dataset pruning for ImageNet and LAION-2B.
☆79 Updated last year
TencentARC / FLM
Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)
☆32 Updated 2 years ago
changlin31 / AutoProg
(CVPR 2022) Automated Progressive Learning for Efficient Training of Vision Transformers
☆25 Updated 8 months ago
shawnricecake / search-llm
[NeurIPS 2024] Search for Efficient LLMs
☆15 Updated 9 months ago
kodenii / ImaginaryNet
ImaginaryNet: Learning Object Detectors without Real Images and Annotations
☆26 Updated 2 years ago