microsoft / VisEval
A benchmark designed to evaluate visualization generation methods.
☆48Updated 3 months ago
Alternatives and similar repositories for VisEval
Users interested in VisEval are comparing it to the libraries listed below:
- ☆88Updated last year
- Awesome-Paper-list: Visualization meets LLM☆49Updated last week
- InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks (ICML 2024)☆153Updated 4 months ago
- ☆67Updated 3 months ago
- Implementation of the MATRIX framework (ICML 2024)☆60Updated last year
- Official Implementation for EMNLP 2024 (main) "AgentReview: Exploring Academic Peer Review with LLM Agent."☆87Updated 10 months ago
- JudgeLRM: Large Reasoning Models as a Judge☆39Updated 3 weeks ago
- A research repo for experiments about Reinforcement Finetuning☆52Updated 6 months ago
- LLM for Scientific Research Survey☆104Updated 8 months ago
- [NeurIPS 2024 D&B Track] GTA: A Benchmark for General Tool Agents☆124Updated 6 months ago
- The implementation for ICLR 2025 Oral: From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions.☆47Updated 2 months ago
- ☆81Updated 4 years ago
- ☆38Updated last year
- [ICML'25 Oral] Multi-agent Architecture Search via Agentic Supernet☆182Updated 4 months ago
- [ICLR 2025] ChartMimic: Evaluating LMM’s Cross-Modal Reasoning Capability via Chart-to-Code Generation☆123Updated 3 months ago
- [ACL'25 Main] ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation☆61Updated 2 months ago
- [ICLR 2025] Benchmarking Agentic Workflow Generation☆130Updated 7 months ago
- [NeurIPS 2024] Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?☆131Updated last year
- [ACL2024] Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios☆64Updated 2 months ago
- ☆54Updated 5 months ago
- [ICLR 2025] DSBench: How Far are Data Science Agents from Becoming Data Science Experts?☆76Updated last month
- Official Implementation of Dynamic LLM-Agent Network: An LLM-agent Collaboration Framework with Agent Team Optimization☆169Updated last year
- Open Source Implementation of Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evo…☆87Updated 2 months ago
- This is the official GitHub repository for our survey paper "Beyond Single-Turn: A Survey on Multi-Turn Interactions with Large Language …☆115Updated 4 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction☆82Updated 6 months ago
- Vega-Lite Chart Dataset and NL Generation Framework using LLMs☆130Updated last year
- [NeurIPS 2024 Oral] Aligner: Efficient Alignment by Learning to Correct☆186Updated 8 months ago
- Test-time preference optimization (ICML 2025).☆168Updated 5 months ago
- ☆127Updated last month
- ncNet is a Transformer-based model for supporting NL2VIS.☆44Updated last year