ServiceNow / agent-poirot
☆15 · Updated 5 months ago
Alternatives and similar repositories for agent-poirot
Users interested in agent-poirot are comparing it to the libraries listed below.
- Code for Language-Interfaced FineTuning for Non-Language Machine Learning Tasks. ☆130 · Updated 11 months ago
- ☆77 · Updated last year
- WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks? ☆213 · Updated last week
- A small library of LLM judges ☆294 · Updated 2 months ago
- ☆53 · Updated last week
- Discovering Data-driven Hypotheses in the Wild ☆113 · Updated 4 months ago
- State-of-the-art paired encoder and decoder models (17M-1B params) ☆50 · Updated 2 months ago
- PyTorch library for Active Fine-Tuning ☆93 · Updated 3 weeks ago
- ☆48 · Updated last year
- Research on Tabular Foundation Models ☆58 · Updated 10 months ago
- Benchmarking Large Language Models ☆99 · Updated 3 months ago
- Flexible library for merging large language models (LLMs) via evolutionary optimization (ACL 2025 Demo). ☆88 · Updated 2 months ago
- Efficient multi-prompt evaluation of LLMs ☆22 · Updated 10 months ago
- ☆109 · Updated 8 months ago
- Efficiently find the best-suited language model (LM) for your NLP task ☆127 · Updated 2 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆79 · Updated last year
- EvalAssist is an open-source project that simplifies using large language models as evaluators (LLM-as-a-Judge) of the output of other la… ☆89 · Updated last week
- LLM Attributor: Attribute LLM's Generated Text to Training Data ☆63 · Updated last month
- This is the official repository for HypoGeniC (Hypothesis Generation in Context) and HypoRefine, which are automated, data-driven tools t… ☆89 · Updated 3 weeks ago
- LangFair is a Python library for conducting use-case level LLM bias and fairness assessments ☆236 · Updated last week
- ☆36 · Updated 2 years ago
- Evaluating LLMs with fewer examples ☆163 · Updated last year
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods ☆136 · Updated 3 months ago
- ☆80 · Updated this week
- PyTorch implementation for MRL ☆19 · Updated last year
- ☆55 · Updated 2 years ago
- CiteME is a benchmark designed to test the abilities of language models in finding papers that are cited in scientific texts. ☆48 · Updated 11 months ago
- Dataset and evaluation suite enabling LLM instruction-following for scientific literature understanding. ☆42 · Updated 7 months ago
- A mechanistic approach for understanding and detecting factual errors of large language models. ☆46 · Updated last year
- Evaluation of neuro-symbolic engines ☆39 · Updated last year