RUCKBReasoning / SpreadsheetBench
SpreadsheetBench: Towards Challenging Real World Spreadsheet Manipulation
☆17 · Updated 5 months ago
Alternatives and similar repositories for SpreadsheetBench:
Users interested in SpreadsheetBench are comparing it to the repositories listed below.
- Code for paper Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding ☆63 · Updated 9 months ago
- RECOMP: Improving Retrieval-Augmented LMs with Compression and Selective Augmentation. ☆125 · Updated 9 months ago
- ☆44 · Updated 3 months ago
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval ☆94 · Updated last month
- [Neurips2024] Source code for xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token ☆131 · Updated 9 months ago
- This is the code repo for our paper "Autonomously Knowledge Assimilation and Accommodation through Retrieval-Augmented Agents". ☆103 · Updated 5 months ago
- [NeurIPS 2024] Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows? ☆120 · Updated 7 months ago
- The code of arxiv paper: "CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis" ☆23 · Updated 2 months ago
- Open source code of the paper: "OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain" ☆54 · Updated 3 months ago
- [NAACL'24] Dataset, code and models for "TableLlama: Towards Open Large Generalist Models for Tables". ☆127 · Updated 10 months ago
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator" ☆54 · Updated last year
- Watch Every Step! LLM Agent Learning via Iterative Step-level Process Refinement (EMNLP 2024 Main Conference) ☆57 · Updated 5 months ago
- Code for ICLR 2024 paper "CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets" ☆52 · Updated 10 months ago
- The official GitHub repository for TC-RAG (Turing-Complete RAG) ☆51 · Updated last month
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference) ☆132 · Updated 5 months ago
- Implementation of the paper: "Making Retrieval-Augmented Language Models Robust to Irrelevant Context" ☆67 · Updated 7 months ago
- [ICLR 2025] InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales ☆81 · Updated last month
- [NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper entitled "R-Tuning: Instructing Large Language Models to Say 'I Don't… ☆109 · Updated 8 months ago
- 🌲 Code for our EMNLP 2023 paper - 🎄 "Tree of Clarifications: Answering Ambiguous Questions with Retrieval-Augmented Large Language Mode… ☆48 · Updated last year
- [EMNLP 2024 (Oral)] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA ☆119 · Updated 4 months ago
- [COLING 2025] ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios ☆65 · Updated 4 months ago
- The official Github repository for paper "R^2AG: Incorporating Retrieval Information into Retrieval Augmented Generation" (EMNLP 2024 Fin… ☆30 · Updated 3 months ago
- [ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement ☆181 · Updated last year
- DSBench: How Far are Data Science Agents from Becoming Data Science Experts? ☆48 · Updated last month
- Benchmark baseline for retrieval qa applications ☆106 · Updated 11 months ago
- The GitHub repository for the paper "Self-prompted Chain-of-Thought on Large Language Models for Open-domain Multi-hop Reasoning" accepte… ☆18 · Updated last year
- Code implementation of synthetic continued pretraining ☆97 · Updated 2 months ago
- "Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases" b… ☆42 · Updated last year
- ☆101 · Updated 3 months ago
- PGRAG ☆48 · Updated 8 months ago