☆43 · May 29, 2025 · Updated 10 months ago
Alternatives and similar repositories for metal
Users that are interested in metal are comparing it to the libraries listed below.
- [ICLR 2025] ChartMimic: Evaluating LMM’s Cross-Modal Reasoning Capability via Chart-to-Code Generation ☆131 · Dec 19, 2025 · Updated 4 months ago
- [ACL 2025] Exploring Compositional Generalization of Multimodal LLMs for Medical Imaging ☆39 · Jun 4, 2025 · Updated 10 months ago
- Code for Our EMNLP (Industry) 2023 paper "LLM4Vis: Explainable Visualization Recommendation using ChatGPT" ☆29 · Feb 4, 2024 · Updated 2 years ago
- A Framework for Evaluating AI Agent Safety in Realistic Environments ☆31 · Oct 2, 2025 · Updated 6 months ago
- ☆29 · Nov 9, 2025 · Updated 5 months ago
- Code for ExploreTom ☆93 · Jun 25, 2025 · Updated 9 months ago
- ☆13 · Jan 22, 2025 · Updated last year
- [NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs ☆145 · Apr 22, 2025 · Updated 11 months ago
- ☆23 · Aug 17, 2024 · Updated last year
- ☆16 · Jan 28, 2026 · Updated 2 months ago
- Official Code for "Learning to Reason via Mixture-of-Thought for Logical Reasoning" ☆28 · Nov 20, 2025 · Updated 4 months ago
- An official repository for GPTailor ☆17 · Jun 29, 2025 · Updated 9 months ago
- Example for exposing MCP servers to Pydantic Agents ☆18 · Mar 16, 2025 · Updated last year
- GenExam: A Multidisciplinary Text-to-Image Exam ☆64 · Mar 29, 2026 · Updated 3 weeks ago
- A collection of AWESOME language modeling techniques on tabular data applications. ☆32 · Oct 14, 2024 · Updated last year
- ☆10 · Mar 13, 2023 · Updated 3 years ago
- Korean Benchmark for Korean Legal Language Understanding ☆18 · Nov 16, 2024 · Updated last year
- OpenVLThinker: An Early Exploration to Vision-Language Reasoning via Iterative Self-Improvement ☆139 · Apr 11, 2026 · Updated last week
- This is the repository for paper EscapeBench: Pushing Language Models to Think Outside the Box ☆18 · Dec 19, 2024 · Updated last year
- ☆45 · Nov 1, 2025 · Updated 5 months ago
- Source code for the collaborative reasoner research project at Meta FAIR. ☆113 · Mar 26, 2026 · Updated 3 weeks ago
- An LLM Multi-Agent Framework toward Ultra Large-Scale Code Generation and Optimization ☆17 · Dec 22, 2024 · Updated last year
- ☆48 · Apr 7, 2026 · Updated last week
- ViDRiP-LLaVA: A Dataset and Benchmark for Diagnostic Reasoning from Pathology Videos ☆23 · May 21, 2025 · Updated 10 months ago
- Self-Weighted Contrastive Learning among Multiple Views for Mitigating Representation Degeneration ☆11 · Oct 23, 2023 · Updated 2 years ago
- Measuring and Controlling Persona Drift in Language Model Dialogs ☆24 · Feb 26, 2024 · Updated 2 years ago
- Official implementation of "What does CLIP know about a red circle? Visual Prompt Engineering for VLMs", ICCV 2023 ☆11 · Sep 21, 2023 · Updated 2 years ago
- ☆147 · May 23, 2024 · Updated last year
- ToMATO: Verbalizing the Mental States of Role-Playing LLMs for Benchmarking Theory of Mind (AAAI2025) ☆19 · Apr 16, 2025 · Updated last year
- M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning ☆46 · Jul 17, 2025 · Updated 9 months ago
- AutoLibra: Metric Induction for Agents from Open-Ended Human Feedback ☆17 · Oct 15, 2025 · Updated 6 months ago
- ☆18 · Mar 19, 2025 · Updated last year
- Systematic evaluation framework that automatically rates overthinking behavior in large language models. ☆99 · May 16, 2025 · Updated 11 months ago
- A comprehensive benchmark for evaluating deep research agents on academic survey tasks ☆50 · Sep 4, 2025 · Updated 7 months ago
- Complex-Edit: CoT-Like Instruction Generation for Complexity-Controllable Image Editing Benchmark ☆28 · Apr 22, 2025 · Updated 11 months ago
- Hate speech detection corpus in Korean, shared with EMNLP 2023 paper ☆17 · Apr 19, 2024 · Updated 2 years ago
- Official repo for "All Roads Lead to Rome: Incentivizing Divergent Thinking in Vision-Language Models" (CVPR 2026) ☆53 · Apr 9, 2026 · Updated last week
- A simple visual test-time scaling method for GUI agent grounding ☆23 · Dec 7, 2025 · Updated 4 months ago
- Official code of paper "Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models" ☆87 · May 27, 2025 · Updated 10 months ago