obalcells / hallucination_probes
Real-Time Detection of Hallucinated Entities in Long-Form Generation
☆251Updated 3 weeks ago
Alternatives and similar repositories for hallucination_probes
Users that are interested in hallucination_probes are comparing it to the libraries listed below
- ☆300Updated 2 months ago
- Verifiers for LLM Reinforcement Learning☆75Updated 3 weeks ago
- ☆86Updated last year
- Solving data for LLMs - Create quality synthetic datasets!☆151Updated 8 months ago
- Inference, Fine Tuning and many more recipes with Gemma family of models☆268Updated 2 months ago
- The State Of The Art, intelligence☆152Updated last month
- Code to accompany the Universal Deep Research paper (https://arxiv.org/abs/2509.00244)☆436Updated last month
- ☆232Updated 3 months ago
- II-Researcher: a new open-source framework designed to aid building search / research agents☆473Updated 2 months ago
- frozen-in-time version of our Paper Finder agent for reproducing evaluation results☆178Updated last month
- ☆155Updated 5 months ago
- OSS RL environment + evals toolkit☆184Updated this week
- Together Open Deep Research☆352Updated 5 months ago
- Collection of scripts and notebooks for OpenAI's latest GPT OSS models☆456Updated last month
- ⚖️ Awesome LLM Judges ⚖️☆128Updated 5 months ago
- ☆78Updated 2 weeks ago
- Context Engineering Course with DSPy☆187Updated 2 months ago
- CodeScientist: An automated scientific discovery system for code-based experiments☆294Updated 3 months ago
- ☆89Updated 8 months ago
- Simple examples using Argilla tools to build AI☆55Updated 10 months ago
- An OpenSource Deep Research library with reasoning☆158Updated 3 weeks ago
- Provider-agnostic, open-source evaluation infrastructure for language models☆558Updated this week
- Simple UI for debugging correlations of text embeddings☆292Updated 4 months ago
- An automated tool for discovering insights from research paper corpora☆139Updated last year
- 🧠 Advanced Claude streaming interface with interleaved thinking, dynamic tool discovery, and MCP integration. Watch Claude think through…☆182Updated 3 months ago
- ☆170Updated 7 months ago
- ScreenSuite - The most comprehensive benchmarking suite for GUI Agents!☆120Updated this week
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning?☆77Updated 6 months ago
- The code repository of the paper: Competition and Attraction Improve Model Fusion☆159Updated last month
- Routing on Random Forest (RoRF)☆211Updated last year