Kartik-3004 / facexbench
FaceXBench: Evaluating Multimodal LLMs on Face Understanding
☆14Updated 2 months ago
Alternatives and similar repositories for facexbench:
Users who are interested in facexbench are comparing it to the repositories listed below
- [NeurIPS 2024] The official implementation of "Image Copy Detection for Diffusion Models"☆16Updated 6 months ago
- INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model☆42Updated 8 months ago
- ☆50Updated this week
- Official Repository of Personalized Visual Instruct Tuning☆28Updated last month
- The official repo of continuous speculative decoding☆24Updated 3 weeks ago
- OLA-VLM: Elevating Visual Perception in Multimodal LLMs with Auxiliary Embedding Distillation, arXiv 2024☆58Updated last month
- ☆34Updated last year
- Official PyTorch Implementation of Self-emerging Token Labeling☆33Updated last year
- This repository provides an improved LLamaGen Model, fine-tuned on 500,000 high-quality images, each accompanied by over 300 token prompt…☆30Updated 5 months ago
- ☆45Updated 3 months ago
- Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment☆48Updated 3 months ago
- Code for the paper "Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers"☆60Updated last month
- ☆17Updated 5 months ago
- 🤖 [ICLR'25] Multimodal Video Understanding Framework (MVU)☆32Updated 2 months ago
- PyTorch code for "ADEM-VL: Adaptive and Embedded Fusion for Efficient Vision-Language Tuning"☆20Updated 5 months ago
- ☆41Updated last year
- Official implementation of Next Block Prediction: Video Generation via Semi-Autoregressive Modeling☆30Updated 2 months ago
- Diffusion Models as Data Mining Tools☆53Updated last month
- ☆43Updated last year
- [NeurIPS 2024] Official PyTorch Implementation of "FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner"☆66Updated 6 months ago
- We propose FlexEdit, an end-to-end image editing method that leverages both free-shape masks and language instructions for Flexible Editi…☆30Updated 7 months ago
- Official implementation for "Diffusion Instruction Tuning"☆21Updated 2 months ago
- Official PyTorch Implementation of Paper "A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Des…☆55Updated 9 months ago
- [IJCV 2024] MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation☆121Updated 6 months ago
- ☆33Updated last year
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement☆71Updated 3 weeks ago
- Official implementation of Add-SD: Rational Generation without Manual Reference.☆27Updated 7 months ago
- Codebase for the paper "Elucidating the design space of language models for image generation"☆45Updated 5 months ago
- Distilling Diversity and Control in Diffusion Models☆37Updated 3 weeks ago
- [NeurIPS 2024] EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models.☆47Updated 6 months ago