SALT-NLP / DARG
The official repo for DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph
☆15Updated 6 months ago
Alternatives and similar repositories for DARG:
Users who are interested in DARG are comparing it to the libraries listed below
- Graph of Records: Boosting Retrieval Augmented Generation for Long-context Summarization with Graphs☆20Updated 6 months ago
- The source code for running LLMs on the AAAR-1.0 benchmark.☆16Updated last month
- AbstainQA, ACL 2024☆25Updated 7 months ago
- Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators (Liu et al.; COLM 2024)☆47Updated 3 months ago
- Evaluate the Quality of Critique☆34Updated 11 months ago
- Code for "Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective"☆32Updated last year
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"☆54Updated last year
- [ICLR 2024] Unveiling the Pitfalls of Knowledge Editing for Large Language Models☆22Updated 10 months ago
- Code/data for MARG (multi-agent review generation)☆43Updated 5 months ago
- SRTK: Retrieve semantic-relevant subgraphs from large-scale knowledge graphs☆27Updated 7 months ago
- Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval☆45Updated 6 months ago
- [NAACL 2025] The official implementation of paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language M…☆26Updated last year
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts"☆68Updated last year
- This is the implementation for the paper "LARGE LANGUAGE MODEL CASCADES WITH MIXTURE OF THOUGHT REPRESENTATIONS FOR COST-EFFICIENT REA…☆21Updated 11 months ago
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval☆110Updated 3 weeks ago
- Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments (EMNLP'2024)☆36Updated 4 months ago
- Process Reward Models That Think☆30Updated this week
- [ICML 2025] Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples☆85Updated last month
- This is the code of MMOA-RAG.☆51Updated last month
- Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data☆35Updated 2 months ago
- [arXiv preprint] Official Repository for "Evaluating Language Models as Synthetic Data Generators"☆33Updated 4 months ago
- This is the official repo for Towards Uncertainty-Aware Language Agent.☆24Updated 8 months ago
- The repository for ACL 2024 paper "TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models"☆31Updated 10 months ago
- ✨ Resolving Knowledge Conflicts in Large Language Models, COLM 2024☆16Updated 7 months ago