1171-jpg / MARVEL_AVR
Github repo for MARVEL: Multidimensional Abstraction and Reasoning through Visual Evaluation and Learning
☆16 Updated last year
Alternatives and similar repositories for MARVEL_AVR
Users that are interested in MARVEL_AVR are comparing it to the libraries listed below
- Official Repo for MageBench: Bridging Large Multimodal Models to Agents ☆21 Updated 5 months ago
- Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models ☆36 Updated last week
- ☆38 Updated 6 months ago
- Unofficial Implementation of Chain-of-Thought Reasoning Without Prompting ☆32 Updated last year
- Setup scripts for the WebArena benchmark ☆11 Updated last week
- [NeurIPS 2024] OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI ☆102 Updated 3 months ago
- The official github repo for the open online courses: "Dive into LLMs". ☆10 Updated last year
- [NeurIPS 2024] A task generation and model evaluation system for multimodal language models. ☆71 Updated 7 months ago
- [NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs ☆24 Updated 9 months ago
- Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073 ☆28 Updated 11 months ago
- AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time ☆72 Updated 2 weeks ago
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning" ☆12 Updated this week
- ☆112 Updated this week
- Official code and dataset for our EMNLP 2024 Findings paper: Stark: Social Long-Term Multi-Modal Conversation with Persona Commonsense Kn… ☆19 Updated 6 months ago
- X-Reasoner: Towards Generalizable Reasoning Across Modalities and Domains ☆46 Updated last month
- [Under Review] Official PyTorch implementation code for realizing the technical part of Phantom of Latent representing equipped with enla… ☆60 Updated 8 months ago
- ☆10 Updated 5 months ago
- Official Implementation of ARPO: End-to-End Policy Optimization for GUI Agents with Experience Replay ☆79 Updated last month
- Large Language Models Can Self-Improve in Long-context Reasoning ☆70 Updated 7 months ago
- ☆60 Updated 4 months ago
- Sotopia-π: Interactive Learning of Socially Intelligent Language Agents (ACL 2024) ☆65 Updated last year
- Official code and dataset for our NAACL 2024 paper: DialogCC: An Automated Pipeline for Creating High-Quality Multi-modal Dialogue Datase… ☆13 Updated last year
- 🔥 Omni large models and datasets for understanding and generating multi-modalities. ☆15 Updated 8 months ago
- Extensive Self-Contrast Enables Feedback-Free Language Model Alignment ☆21 Updated last year
- A Comprehensive Benchmark for Robust Multi-image Understanding ☆11 Updated 9 months ago
- ☆29 Updated 2 months ago
- Official implementation of Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning ☆16 Updated 7 months ago
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ☆57 Updated 8 months ago
- [NeurIPS 2024] A comprehensive benchmark for evaluating critique ability of LLMs ☆39 Updated 7 months ago
- SWE-Factory: Your Automated Factory for Issue Resolution Training Data and Evaluation Benchmarks ☆55 Updated last week