obalcells / hallucination_probes
Real-Time Detection of Hallucinated Entities in Long-Form Generation
☆275 Updated 2 months ago
Alternatives and similar repositories for hallucination_probes
Users who are interested in hallucination_probes are comparing it to the repositories listed below
- Verifiers for LLM Reinforcement Learning ☆80 Updated 4 months ago
- ⚖️ Awesome LLM Judges ⚖️ ☆148 Updated 8 months ago
- The State Of The Art, intelligence ☆157 Updated 5 months ago
- Super basic implementation (gist-like) of RLMs with REPL environments. ☆435 Updated last week
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆256 Updated this week
- ☆301 Updated 5 months ago
- Code to accompany the Universal Deep Research paper (https://arxiv.org/abs/2509.00244) ☆459 Updated 4 months ago
- ☆68 Updated 7 months ago
- ☆87 Updated last year
- Simple UI for debugging correlations of text embeddings ☆305 Updated 7 months ago
- ☆176 Updated 10 months ago
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models ☆495 Updated 4 months ago
- An alignment auditing agent capable of quickly exploring alignment hypothesis ☆804 Updated last week
- Curated collection of community environments ☆204 Updated this week
- Inference, Fine Tuning and many more recipes with Gemma family of models ☆276 Updated 5 months ago
- Evolve your language agent with Agentic Context Engineering (ACE) ☆507 Updated last month
- General plug-and-play inference library for Recursive Language Models (RLMs), supporting various sandboxes. ☆876 Updated this week
- Simple examples using Argilla tools to build AI ☆57 Updated last year
- Together Open Deep Research ☆354 Updated 9 months ago
- frozen-in-time version of our Paper Finder agent for reproducing evaluation results ☆219 Updated 4 months ago
- This repository contains the toolkit for replicating results from our technical report. ☆192 Updated 4 months ago
- ☆92 Updated 6 months ago
- Codebase for FinePDFs ☆161 Updated last week
- The official code implementation for "Cache-to-Cache: Direct Semantic Communication Between Large Language Models" ☆314 Updated this week
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆87 Updated 9 months ago
- ☆79 Updated 3 months ago
- Deep research agents using MiniMax M2.1 interleaved thinking ☆192 Updated 3 weeks ago
- ☆182 Updated 11 months ago
- ☆236 Updated last month
- Train Large Language Models on MLX. ☆240 Updated this week