stanfordmlgroup / ManyICL
☆116 · Updated 3 months ago
Related projects:
- E5-V: Universal Embeddings with Multimodal Large Language Models ☆148 · Updated 2 months ago
- Official code for the paper "Mantis: Multi-Image Instruction Tuning" ☆158 · Updated last week
- A task generation and model evaluation system ☆51 · Updated last week
- Code for Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models ☆93 · Updated last month
- Official implementation of MAIA, a Multimodal Automated Interpretability Agent ☆56 · Updated last month
- LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture ☆115 · Updated 2 weeks ago
- [ICML 2024 Oral] Official code repository for MLLM-as-a-Judge ☆47 · Updated last month
- Code and example data for the paper "Rule Based Rewards for Language Model Safety" ☆131 · Updated 2 months ago
- [CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(… ☆220 · Updated 6 months ago
- Evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for E… ☆323 · Updated last week
- An LLM-free Multi-dimensional Benchmark for Multi-modal Hallucination Evaluation ☆85 · Updated 8 months ago
- Public code repo for the paper "A Single Transformer for Scalable Vision-Language Modeling" ☆103 · Updated last month
- Code accompanying the paper "Massive Activations in Large Language Models" ☆104 · Updated 6 months ago
- Towards Large Multimodal Models as Visual Foundation Agents ☆87 · Updated 3 weeks ago
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models ☆89 · Updated 4 months ago
- Official implementation for the paper "Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention M… ☆95 · Updated last month
- Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models ☆67 · Updated this week
- "Improving Mathematical Reasoning with Process Supervision" by OpenAI ☆55 · Updated last week
- [ICML 2024] MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI ☆84 · Updated 2 months ago
- ControlLLM: Augment Language Models with Tools by Searching on Graphs ☆184 · Updated 2 months ago
- Public code repo for the paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales" ☆82 · Updated 2 months ago
- Multimodal language model benchmark featuring challenging examples ☆144 · Updated last month
- MATH-Vision dataset and code to measure multimodal mathematical reasoning capabilities ☆53 · Updated 2 weeks ago
- Evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.or… ☆100 · Updated 2 months ago
- Official code for the paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024) ☆92 · Updated 2 months ago
- Python library to evaluate vision-language models' robustness across diverse benchmarks ☆162 · Updated 2 weeks ago
- Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning ☆93 · Updated 2 months ago