songrise/MLLM4Art


[ACM MM 2025] MLLMs for Aesthetics Reasoning

☆26

Alternatives and similar repositories for MLLM4Art

Users who are interested in MLLM4Art are comparing it to the repositories listed below.

BestiVictory / DPC-Captions
An image caption dataset about images from www.dpchallenge.com.
☆20 · Dec 12, 2019 · Updated 6 years ago
Dreemurr-T / BAID
☆101 · Nov 21, 2023 · Updated 2 years ago
w3yyb / awesome-ubuntu
Working with Ubuntu: a guide to installing, using, and configuring Ubuntu
☆10 · Mar 23, 2023 · Updated 3 years ago
yipoh / TAVAR
[TCSVT] Theme-aware Visual Attribute Reasoning for Image Aesthetics Assessment
☆23 · Apr 10, 2023 · Updated 3 years ago
KoreTeknology / ComfyUI-Nai-Production-Nodes-Pack
A set of Custom Nodes for Compositing for ComfyUI
☆16 · Nov 24, 2024 · Updated last year
yipoh / AesExpert
[ACMMM 2024] AesExpert: Towards Multi-modality Foundation Model for Image Aesthetics Perception
☆105 · Jan 19, 2025 · Updated last year
hobart07 / Step1X-Edit_train
☆14 · May 20, 2025 · Updated last year
Anne-SofieMaerten / LAPIS
☆20 · Apr 11, 2025 · Updated last year
SLIT-AI / WRPO
[ICLR 2025] Weighted-Reward Preference Optimization for Implicit Model Fusion
☆14 · Mar 17, 2025 · Updated last year
BestiVictory / CJS-CNN
☆10 · May 6, 2018 · Updated 8 years ago
woshidandan / IAA_Tutorial
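Introductory learning materials (external version) from the lab's aesthetics research group; more detailed internal materials are available after joining the group.
☆87 · May 15, 2026 · Updated 2 months ago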
txsun1997 / Multi-Task-Learning-using-Uncertainty-to-Weigh-Losses
☆12 · Mar 18, 2019 · Updated 7 years ago
desaixie / gait
Official code for ICCV 2023 paper: GAIT: Generating Aesthetic Indoor Tours with Deep Reinforcement Learning
☆12 · Dec 31, 2023 · Updated 2 years ago
alienzhou / giframe
Extract the first frame of a GIF without reading all of its bytes; supports both browser and Node.js 📸
☆24 · Feb 3, 2020 · Updated 6 years ago
WeihuangLin / INF-LLaVA
INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model
☆42 · Aug 4, 2024 · Updated last year
Sueqk / LMM-VQA
LMM for VQA, TCSVT version
☆10 · Jul 19, 2024 · Updated 2 years ago
gy8888 / RelationAdapter
Code Implementation of “RelationAdapter: Learning and Transferring Visual Relation with Diffusion Transformers”
☆33 · Apr 13, 2026 · Updated 3 months ago
liujin112 / ZePo
[ACM MM24] Official implementation of ACM MM 2024 paper: "ZePo: Zero-Shot Portrait Stylization with Faster Sampling"
☆43 · Aug 22, 2024 · Updated last year
yipoh / AesBench
An expert benchmark aiming to comprehensively evaluate the aesthetic perception capacities of MLLMs.
☆261 · Feb 4, 2025 · Updated last year
klauscc / DAM
Official code for DAM: Dynamic Adapter Merging for Continual Video QA Learning
☆15 · Apr 25, 2024 · Updated 2 years ago
yale-nlp / refdpo
☆16 · Jul 23, 2024 · Updated 2 years ago
TamashaM / NAPA-VQ
☆11 · Jul 4, 2024 · Updated 2 years ago
Yikai-Wang / SEELE-ReS
This repo contains the proposed dataset ReS and code in our paper Repositioning The Subject Within Image
☆28 · Nov 26, 2024 · Updated last year
uynaes / RankingAwareCLIP
[ICLR'25] Official repository of paper: Ranking-aware adapter for text-driven image ordering with CLIP
☆16 · Apr 17, 2025 · Updated last year
jiaangli / VILA
[TACL/EMNLP'24] Do Vision and Language Models Share Concepts? A Vector Space Alignment Study
☆16 · Nov 22, 2024 · Updated last year
coriverchen / ProvablySecureSteganography
☆15 · Jun 26, 2023 · Updated 3 years ago
gmum / HyperNeRFGAN
Generative model for 3D objects.
☆18 · Aug 12, 2023 · Updated 2 years ago
matthias-wright / art-fid
ArtFID: Quantitative Evaluation of Neural Style Transfer
☆72 · Jul 17, 2024 · Updated 2 years ago
xyfJASON / ctrlora
[ICLR 2025] Codebase for "CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation"
☆268 · Mar 6, 2026 · Updated 4 months ago
showlab / TPDiff
TPDiff: Temporal Pyramid Video Diffusion Model
☆25 · Mar 13, 2025 · Updated last year
a-nagrani / CVPR2020_Poster
Speech2Action CVPR Poster Source Code
☆20 · Apr 29, 2020 · Updated 6 years ago
satoshi-kosugi / PG-IA-NILUT
☆19 · Aug 21, 2024 · Updated last year
taco-group / COVER
🏆 [CVPRW 2024] COVER: A Comprehensive Video Quality Evaluator. 🥇 Winner solution for Video Quality Assessment Challenge at the 1st AIS…
☆100 · Jul 18, 2024 · Updated 2 years ago
gjwang / bittorrent
for bittorrent test
☆15 · May 22, 2014 · Updated 12 years ago
blepping / comfyui_jankdiffusehigh
Janky implementation of DiffuseHigh for ComfyUI
☆37 · May 6, 2025 · Updated last year
ActivityForensics / activityforensics
[CVPR 2026] ActivityForensics: A Comprehensive Benchmark for Localizing Manipulated Activity in Videos
☆23 · Apr 27, 2026 · Updated 3 months ago
lucasjinreal / LLaVA-Magvit2
LLaVA combined with the Magvit image tokenizer, training an MLLM without a vision encoder. Unifies image understanding and generation.
☆38 · Jun 20, 2024 · Updated 2 years ago
Liuxinyv / HiPrompt
[IJCV 2026] HiPrompt: Tuning-free Higher-Resolution Generation with Hierarchical MLLM Prompts
☆26 · Feb 28, 2025 · Updated last year
Aaron617 / text2world
[ACL 2025 Findings] Text2World: Benchmarking Large Language Models for Symbolic World Model Generation
☆29 · Feb 25, 2025 · Updated last year