mesolitica / multimodal-LLM

Multi-modal language modeling with image, audio, and text integration, including multiple images and multiple audio inputs in a single multi-turn conversation.
14 · Updated 9 months ago

Related projects

Alternatives and complementary repositories for multimodal-LLM