ServiceNow / agent-poirot
☆11 Updated 6 months ago
Alternatives and similar repositories for agent-poirot:
Users who are interested in agent-poirot are comparing it to the libraries listed below.
- ☆39 Updated 2 months ago
- DoomArena is a Framework for Testing AI Agents Against Evolving Security Threats ☆19 Updated this week
- WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks? ☆180 Updated last week
- PyTorch library for Active Fine-Tuning ☆64 Updated 2 months ago
- ☆34 Updated last year
- Stanford NLP Python library for benchmarking the utility of LLM interpretability methods ☆70 Updated last month
- Efficient multi-prompt evaluation of LLMs ☆19 Updated 4 months ago
- Google Research ☆46 Updated 2 years ago
- ☆87 Updated 9 months ago
- CiteME is a benchmark designed to test the abilities of language models in finding papers that are cited in scientific texts. ☆43 Updated 5 months ago
- Official implementation of the ACL 2024 paper "Scientific Inspiration Machines Optimized for Novelty" ☆78 Updated last year
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆305 Updated 5 months ago
- Source code of "Calibrating Large Language Models Using Their Generations Only", ACL2024 ☆15 Updated 5 months ago
- ☆44 Updated 5 months ago
- ☆42 Updated last year
- ☆72 Updated 11 months ago
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models" ☆108 Updated last year
- LLM finetuning in resource-constrained environments. ☆47 Updated 10 months ago
- Mixing Language Models with Self-Verification and Meta-Verification ☆104 Updated 4 months ago
- The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions ☆69 Updated 2 years ago
- [NeurIPS'23] Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors ☆75 Updated 4 months ago
- Use this package to compute intrinsic dimensionality of your task given a fixed neural network in PYTORCH! ☆35 Updated 2 years ago
- [NAACL 2025] Towards Rationality in Language and Multimodal Agents: A Survey ☆27 Updated 2 months ago
- ☆55 Updated 2 weeks ago
- ☆140 Updated 11 months ago
- Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales" ☆104 Updated 6 months ago
- Code and data accompanying our paper on arXiv "Faithful Chain-of-Thought Reasoning". ☆158 Updated 11 months ago
- ☆145 Updated last year
- Data and code for the preprint "In-Context Learning with Long-Context Models: An In-Depth Exploration" ☆35 Updated 8 months ago
- Code for In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering ☆170 Updated 2 months ago