DataSciBench: An LLM Agent Benchmark for Data Science
☆57 · Jan 21, 2026 · Updated 3 months ago
Alternatives and similar repositories for DataSciBench
Users who are interested in DataSciBench are comparing it to the repositories listed below.
- [EMNLP 2025] Code for paper "Table-R1: Inference-Time Scaling for Table Reasoning" ☆29 · Jun 3, 2025 · Updated 10 months ago
- ☆53 · Aug 24, 2025 · Updated 8 months ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models ☆40 · Jun 10, 2024 · Updated last year
- ☆34 · Mar 21, 2026 · Updated last month
- Code and Data for "MIRAI: Evaluating LLM Agents for Event Forecasting" ☆96 · Jul 2, 2024 · Updated last year
- Reproducible Language Agent Research ☆35 · Jun 25, 2025 · Updated 10 months ago
- The code and datasets of our ACM MM 2024 paper "Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed … ☆11 · Sep 27, 2024 · Updated last year
- ☆10 · Mar 13, 2023 · Updated 3 years ago
- Augmenting Statistical Models with Natural Language Parameters ☆28 · Sep 17, 2024 · Updated last year
- Chinese generation tasks with the multilingual denoising pre-trained model MBart ☆11 · May 27, 2021 · Updated 4 years ago
- Reproducing R1 for Code with Reliable Rewards ☆12 · Apr 9, 2025 · Updated last year
- ☆10 · May 18, 2023 · Updated 2 years ago
- Code for EMNLP 2021 Paper "Recall and Learn: A Memory-augmented Solver for Math Word Problems". ☆16 · Oct 20, 2022 · Updated 3 years ago
- Lightweight auxiliary python framework for writing object-oriented Dash code. ☆13 · Feb 20, 2024 · Updated 2 years ago
- Code for our paper Resources and Evaluations for Multi-Distribution Dense Information Retrieval ☆16 · Jan 16, 2024 · Updated 2 years ago
- ☆33 · Dec 13, 2020 · Updated 5 years ago
- Code Repository for Excel VBA Programming - The Complete Guide, published by Packt ☆14 · Jan 30, 2023 · Updated 3 years ago
- ☆15 · Apr 6, 2026 · Updated 3 weeks ago
- ☆14 · Jul 22, 2021 · Updated 4 years ago
- Official implementation of the paper: "A deeper look at depth pruning of LLMs" ☆15 · Jul 24, 2024 · Updated last year
- SciGLM: Training Scientific Language Models with Self-Reflective Instruction Annotation and Tuning (NeurIPS D&B Track 2024) ☆86 · Feb 25, 2024 · Updated 2 years ago
- [NeurIPS 2024] The official implementation of "Image Copy Detection for Diffusion Models" ☆18 · Oct 1, 2024 · Updated last year
- [NeurIPS 2024] OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI ☆106 · Mar 6, 2025 · Updated last year
- ☆16 · Oct 13, 2020 · Updated 5 years ago
- AlphaPy Pro ☆42 · Apr 24, 2026 · Updated last week
- Minimal yet Awesome Multiboxing Assistant - WoW addon ☆14 · Aug 6, 2025 · Updated 8 months ago
- ☆20 · Mar 19, 2025 · Updated last year
- R Package for the Interactive Shiny Application for exploratory data analysis thru visualization ☆12 · Jan 11, 2020 · Updated 6 years ago
- AutoLibra: Metric Induction for Agents from Open-Ended Human Feedback ☆17 · Apr 23, 2026 · Updated last week
- PyTorch implementation of paper "Evolving Parameterized Prompt Memory for Continual Learning" in AAAI 2024 (Oral). ☆13 · Apr 15, 2024 · Updated 2 years ago
- A Collection of Public Recommender System Dataset ☆11 · Feb 10, 2021 · Updated 5 years ago
- Code for paper Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding ☆92 · Jun 18, 2024 · Updated last year
- [TMM 2019] Official Implementation for Hierarchical User Intent Graph Network for Multimedia Recommendation ☆10 · Apr 6, 2026 · Updated 3 weeks ago
- Control LLM ☆23 · Apr 6, 2025 · Updated last year
- Spell Alerts for WoW Classic ☆11 · Feb 3, 2026 · Updated 2 months ago
- [ICLR-2026] Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning". ☆32 · Feb 26, 2026 · Updated 2 months ago
- I love to code, and my passion drives me to create things for everyone to enjoy. My goal is to enhance your gaming experience, and I hope… ☆31 · Mar 1, 2026 · Updated last month
- Data and code for ACL 2023 paper "RobuT: A Systematic Study of Table QA Robustness Against Human-Annotated Adversarial Perturbations" ☆15 · Feb 8, 2024 · Updated 2 years ago
- [ICLR 2025] Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization ☆12 · Jan 26, 2025 · Updated last year