InternLM / AlchemistCoder
☆35 Updated last year
Alternatives and similar repositories for AlchemistCoder
Users that are interested in AlchemistCoder are comparing it to the libraries listed below
- A Framework for Decoupling and Assessing the Capabilities of VLMs ☆43 Updated last year
- ☆75 Updated last year
- Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs ☆92 Updated last year
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM* ☆109 Updated 5 months ago
- Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models ☆63 Updated last year
- ☆29 Updated last year
- Image Textualization: An Automatic Framework for Generating Rich and Detailed Image Descriptions (NeurIPS 2024) ☆168 Updated last year
- DELT: Data Efficacy for Language Model Training ☆42 Updated 2 months ago
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents ☆47 Updated 8 months ago
- ☆36 Updated last year
- [NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of… ☆117 Updated 11 months ago
- A benchmark for evaluating the capabilities of large vision-language models (LVLMs) ☆45 Updated last year
- The official repo for "VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search" [EMNLP25] ☆35 Updated 2 months ago
- The SAIL-VL2 series model developed by the BytedanceDouyinContent Group ☆75 Updated last month
- ☆74 Updated 4 months ago
- ☆66 Updated last year
- Evaluation framework for paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?" ☆60 Updated last year
- [ICML 2025] TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation ☆118 Updated 5 months ago
- Multimodal Open-O1 (MO1) is designed to enhance the accuracy of inference models by utilizing a novel prompt-based approach. This tool wo… ☆29 Updated last year
- ✨✨Beyond LLaVA-HD: Diving into High-Resolution Large Multimodal Models ☆162 Updated 10 months ago
- Official completion of “Training on the Benchmark Is Not All You Need”. ☆37 Updated 10 months ago
- ☆90 Updated last year
- [NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs ☆128 Updated 6 months ago
- (ICLR 2025) The Official Code Repository for GUI-World. ☆67 Updated 10 months ago
- ZeroGUI: Automating Online GUI Learning at Zero Human Cost ☆100 Updated 3 months ago
- MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment ☆35 Updated last year
- Helper functions for processing and integrating visual language information with Qwen-VL Series Model ☆15 Updated last year
- The code and data of We-Math, accepted by ACL 2025 main conference. ☆133 Updated 3 weeks ago
- ☆50 Updated 2 years ago
- Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models ☆91 Updated last year