anthropic-experimental / agentic-misalignment
☆89 · Updated this week
Alternatives and similar repositories for agentic-misalignment
Users who are interested in agentic-misalignment are comparing it to the libraries listed below
- A better way of testing, inspecting, and analyzing AI Agent traces. ☆38 · Updated 3 weeks ago
- Sphynx Hallucination Induction ☆54 · Updated 4 months ago
- auto fine tune of models with synthetic data ☆75 · Updated last year
- Synthetic data derived by templating, few shot prompting, transformations on public domain corpora, and monte carlo tree search. ☆32 · Updated 3 months ago
- Let Claude control a web browser on your machine. ☆30 · Updated 2 weeks ago
- ☆47 · Updated last year
- Thoughtful Lightning AI Assistant - Dual-engine system with DeepSeek reasoning and Groq inference, featuring Gradio UI, secure API manage… ☆20 · Updated 4 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆68 · Updated 3 months ago
- ☆23 · Updated 7 months ago
- Letting Claude Code develop his own MCP tools :) ☆109 · Updated 3 months ago
- Outputs from the Deep Writer ☆16 · Updated 9 months ago
- Train your own SOTA deductive reasoning model ☆94 · Updated 3 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆91 · Updated 4 months ago
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles ☆35 · Updated last month
- A framework for pitting LLMs against each other in an evolving library of games ⚔ ☆31 · Updated 2 months ago
- ☆50 · Updated 2 months ago
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆57 · Updated 6 months ago
- Challenges for general-purpose web-browsing AI agents ☆58 · Updated 2 weeks ago
- A couple scripts to grab stats from email ☆43 · Updated 9 months ago
- ☆28 · Updated 6 months ago
- ☆19 · Updated 4 months ago
- MLX port for xjdr's entropix sampler (mimics jax implementation) ☆64 · Updated 7 months ago
- look how they massacred my boy ☆63 · Updated 8 months ago
- Just a bunch of benchmark logs for different LLMs ☆119 · Updated 10 months ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo) ☆80 · Updated 3 months ago
- ☆60 · Updated 2 weeks ago
- ☆62 · Updated 3 weeks ago
- Vivaria is METR's tool for running evaluations and conducting agent elicitation research. ☆94 · Updated last week
- The next evolution of Agents ☆48 · Updated 3 weeks ago
- Useful resources for LLM-based Diarization and Transcription. ☆55 · Updated 8 months ago