RUC-NLPIR / OmniEval
Open source code of the paper: "OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain"
☆62 Updated 5 months ago
Alternatives and similar repositories for OmniEval
Users who are interested in OmniEval are comparing it to the repositories listed below.
- The demo, code and data of FollowRAG ☆72 Updated last month
- CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation ☆52 Updated last week
- Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs (ACL 2024) ☆66 Updated 3 weeks ago
- The code and data of DPA-RAG, accepted by WWW 2025 main conference. ☆61 Updated 4 months ago
- IKEA: Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent ☆54 Updated 2 weeks ago
- ☆46 Updated 3 months ago
- ☆57 Updated 7 months ago
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval ☆125 Updated last week
- [COLING 2025] ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios ☆68 Updated 2 weeks ago
- Official github repo for AutoDetect, an automated weakness detection framework for LLMs. ☆42 Updated 11 months ago
- ☆102 Updated 5 months ago
- AutoCoA (Automatic generation of Chain-of-Action) is an agent model framework that enhances the multi-turn tool usage capability of reaso… ☆107 Updated 2 months ago
- [ACL-25] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs. ☆63 Updated 7 months ago
- Reformatted Alignment ☆113 Updated 8 months ago
- The official repo for our paper: LegalAgentBench: Evaluating LLM Agents in Legal Domain ☆24 Updated 5 months ago
- This is the code repo for our paper "Autonomously Knowledge Assimilation and Accommodation through Retrieval-Augmented Agents". ☆106 Updated 7 months ago
- [EMNLP 2024 (Oral)] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA ☆131 Updated 6 months ago
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024] ☆141 Updated 7 months ago
- SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis ☆55 Updated last week
- [ICLR 2025] This is the code repo for our ICLR’25 paper "RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rew… ☆38 Updated 3 months ago
- Code implementation of synthetic continued pretraining ☆110 Updated 4 months ago
- ☆19 Updated last year
- Test-time compute in information retrieval ☆28 Updated last month
- The official code of paper “Tool-Star: Empowering LLM-brained Multi-Tool Reasoner via Reinforcement Learning” ☆99 Updated this week
- [ACL'25] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA. ☆50 Updated last week
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling ☆49 Updated 5 months ago
- [ICLR 2025] InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales ☆97 Updated 3 months ago
- ☆23 Updated 2 weeks ago
- ☆56 Updated 7 months ago
- [EMNLP2024] Aligning Large Language Models on Information Extraction ☆47 Updated 6 months ago