mbzuai-oryx / AINLinks

AIN - The First Arabic Inclusive Large Multimodal Model. It is a versatile bilingual LMM excelling in visual and contextual understanding across diverse domains.

☆49

Alternatives and similar repositories for AIN

Users that are interested in AIN are comparing it to the libraries listed below

Sorting:

mbzuai-oryx / Camel-Bench
[NAACL 2025 🔥] CAMEL-Bench is an Arabic benchmark for evaluating multimodal models across eight domains with 29,000 questions.
☆34Updated 7 months ago
mbzuai-oryx / KITAB-Bench
[ACL 2025 🔥] A Comprehensive Multi-Domain Benchmark for Arabic OCR and Document Understanding
☆60Updated 6 months ago
mbzuai-oryx / BiMediX2
Bio-Medical EXpert LMM with English and Arabic Language Capabilities
☆71Updated last month
mbzuai-oryx / ALM-Bench
[CVPR 2025 🔥] ALM-Bench is a multilingual multi-modal diverse cultural benchmark for 100 languages across 19 categories. It assesses the…
☆45Updated 6 months ago
sayedmohamedscu / Vision-language-models-VLM
vision language models finetuning notebooks & use cases (Medgemma - paligemma - florence .....)
☆56Updated last month
mbzuai-oryx / ClimateGPT
[EMNLP'23] ClimateGPT: a specialized LLM for conversations related to Climate Change and Sustainability topics in both English and Arabi…
☆79Updated last year
mbzuai-oryx / PALO
(WACV 2025 - Oral) Vision-language conversation in 10 languages including English, Chinese, French, Spanish, Russian, Japanese, Arabic, H…
☆84Updated 3 months ago
mbzuai-oryx / BiMediX
Bilingual Medical Mixture of Experts LLM
☆31Updated last year
BioMedIA-MBZUAI / SALT
This repository contains the official source code for SALT: Parameter-Efficient Fine-Tuning via Singular Value Adaptation with Low-Rank T…
☆27Updated 4 months ago
lucaro / VBS-Archive
Archive of Tasks and Results of the Video Browser Showdown
☆13Updated 8 months ago
Muhammad-Ibraheem-Siddiqui / PerSense
[BMVC 2025] Official Implementation of the paper "PerSense: Personalized Instance Segmentation in Dense Images"
☆27Updated 2 months ago
Farzad-R / Finetune-LLAVA-NEXT
This repository contains codes for fine-tuning LLAVA-1.6-7b-mistral (Multimodal LLM) model.
☆40Updated last year
chongzhangFDU / ROOR
This is the official implementation to the EMNLP 2024 paper: Modeling Layout Reading Order as Ordering Relations for Visually-rich Docume…
☆28Updated last year
qhnhynmm / ViOCRVQA-Dataset
The largest VQA dataset for Vietnamese. Related to the text content in the image.
☆21Updated 7 months ago
alexander-moore / vlm
Composition of Multimodal Language Models From Scratch
☆15Updated last year
abachaa / MEDEC
☆37Updated 6 months ago
Hon-Wong / VoRA
[Fully open] [Encoder-free MLLM] Vision as LoRA
☆349Updated 5 months ago
fangyuan-ksgk / Mini-LLaVA
A minimal implementation of LLaVA-style VLM with interleaved image & text & video processing ability.
☆96Updated 11 months ago
mbzuai-oryx / ARB
ARB: A Comprehensive Arabic Multimodal Reasoning Benchmark
☆17Updated 6 months ago
mbzuai-oryx / TimeTravel
[ACL 2025 🔥] Time Travel is a Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts
☆18Updated 6 months ago
BioMedIA-MBZUAI / MedPromptX
☆70Updated 4 months ago
2U1 / SmolVLM-Finetune
An open-source implementaion for fine-tuning SmolVLM.
☆58Updated 2 months ago
MohamedAliRashad / arabic-nougat
Code for Arabic Nougat
☆49Updated last year
akhtarvision / cal-detr
☆42Updated 2 years ago
dali92002 / SSL-OCR
Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement - AAAI 2023
☆28Updated 2 years ago
UBC-NLP / peacock
This is the official repository for Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks.
☆26Updated 11 months ago
ariG23498 / fine-tune-paligemma
Notebooks for fine tuning pali gemma
☆117Updated 7 months ago
ECOFRI / CXR_LLaVA
☆50Updated last year
CAMeL-Lab / arabic-gec
Code, models, and data for "Advancements in Arabic Grammatical Error Detection and Correction: An Empirical Investigation". EMNLP 2023.
☆17Updated last year
ariG23498 / mmdp
☆29Updated 4 months ago