FoundationVision / Groma

[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization
551 · Updated 9 months ago

Alternatives and similar repositories for Groma:

Users who are interested in Groma are comparing it to the repositories listed below.