CSHaitao / LexEval
LexEval: A Comprehensive Benchmark for Evaluating Large Language Models in Legal Domain
☆49 · Updated 2 months ago
Alternatives and similar repositories for LexEval:
Users who are interested in LexEval are comparing it to the repositories listed below:
- Conifer: Improving Complex Constrained Instruction-Following Ability of Large Language Models ☆84 · Updated 9 months ago
- [IJCAI 2024] FactCHD: Benchmarking Fact-Conflicting Hallucination Detection ☆85 · Updated 8 months ago
- Codes and datasets for the paper Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Ref… ☆34 · Updated 3 months ago
- 🌐 WebWalker: Benchmarking LLMs in Web Traversal ☆54 · Updated this week
- The code and data of DPA-RAG ☆54 · Updated 3 months ago
- Open source code of the paper: "OmniEval: An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain" ☆44 · Updated 3 weeks ago
- Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning ☆70 · Updated last year
- The demo, code and data of FollowRAG ☆68 · Updated last month
- [EMNLP 2024 (Oral)] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA ☆108 · Updated 2 months ago
- Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs ☆53 · Updated 3 months ago
- This is the code repo for our paper "Autonomously Knowledge Assimilation and Accommodation through Retrieval-Augmented Agents". ☆102 · Updated 2 months ago
- Code for ICLR 2024 paper "CRAFT: Customizing LLMs by Creating and Retrieving from Specialized Toolsets" ☆50 · Updated 7 months ago
- This is the repository for the generative information retrieval survey. ☆144 · Updated last month
- Code and data for CoachLM, an automatic instruction revision approach for LLM instruction tuning. ☆60 · Updated 9 months ago
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024] ☆129 · Updated 2 months ago
- [Preprint] Learning to Filter Context for Retrieval-Augmented Generation ☆187 · Updated 9 months ago
- [COLING 2025] ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios ☆64 · Updated last month
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval ☆66 · Updated 2 weeks ago
- A framework for editing the CoTs for better factuality ☆47 · Updated last year
- Code for the EMNLP 2024 paper "Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps" ☆118 · Updated 5 months ago
- Implementation of the paper: "Making Retrieval-Augmented Language Models Robust to Irrelevant Context" ☆65 · Updated 5 months ago
- Code for Search-in-the-Chain: Towards Accurate, Credible and Traceable Large Language Models for Knowledge-intensive Tasks ☆50 · Updated 9 months ago
- RECOMP: Improving Retrieval-Augmented LMs with Compression and Selective Augmentation. ☆117 · Updated 6 months ago
- This is the code repo for our paper "RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards". ☆21 · Updated last month
- [EMNLP2024] Aligning Large Language Models on Information Extraction ☆37 · Updated 2 months ago
- Code Repo for EfficientRAG: Efficient Retriever for Multi-Hop Question Answering ☆36 · Updated 2 months ago
- [ACL 2024] ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis. ☆55 · Updated 3 weeks ago