ServiceNow / agent-poirot
☆13 Updated last month
Alternatives and similar repositories for agent-poirot
Users who are interested in agent-poirot are comparing it to the libraries listed below.
- ☆42 Updated 2 weeks ago
- Codebase accompanying the Summary of a Haystack paper. ☆78 Updated 8 months ago
- Code for LitLLMs, LLMs for Literature Review: Are we there yet? (TMLR 2025) ☆30 Updated last month
- Official Repo for InSTA: Towards Internet-Scale Training For Agents ☆42 Updated last week
- Official Code Repository for the paper "Distilling LLM Agent into Small Models with Retrieval and Code Tools" ☆101 Updated this week
- Source code for the collaborative reasoner research project at Meta FAIR. ☆87 Updated last month
- Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators (Liu et al.; COLM 2024) ☆47 Updated 4 months ago
- ☆17 Updated 2 months ago
- Finding semantically meaningful and accurate prompts. ☆46 Updated last year
- DocBench: A Benchmark for Evaluating LLM-based Document Reading Systems ☆35 Updated 8 months ago
- ☆47 Updated last year
- Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales" ☆106 Updated 8 months ago
- Official implementation of the ACL 2024: Scientific Inspiration Machines Optimized for Novelty ☆79 Updated last year
- Google Research ☆46 Updated 2 years ago
- Official code release for the paper Coder Reviewer Reranking for Code Generation. ☆43 Updated 2 years ago
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ☆128 Updated last year
- ☆23 Updated 3 months ago
- PyTorch implementation for MRL ☆18 Updated last year
- Synthetic Data Generation for Evaluation ☆14 Updated 3 months ago
- Dataset and evaluation suite enabling LLM instruction-following for scientific literature understanding. ☆40 Updated 2 months ago
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods ☆92 Updated this week
- Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering" ☆84 Updated 9 months ago
- Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" 🤖 ☆68 Updated 6 months ago
- MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents [EMNLP 2024] ☆162 Updated 5 months ago
- WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks? ☆188 Updated last month
- Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models. TMLR 2025. ☆62 Updated last month
- Codes and datasets for the paper Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Ref… ☆57 Updated 3 months ago
- Efficient multi-prompt evaluation of LLMs ☆19 Updated 6 months ago
- Evaluation of neuro-symbolic engines ☆35 Updated 10 months ago
- LOFT: A 1 Million+ Token Long-Context Benchmark ☆198 Updated last month