javyduck / KnowHalu
☆47 Updated 10 months ago
Alternatives and similar repositories for KnowHalu
Users who are interested in KnowHalu are comparing it to the repositories listed below:
- Source code for our paper: "SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals". ☆65 Updated 9 months ago
- Mixing Language Models with Self-Verification and Meta-Verification ☆102 Updated 3 months ago
- ☆41 Updated 3 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated 8 months ago
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆104 Updated 3 months ago
- The first dense retrieval model that can be prompted like an LM ☆68 Updated 6 months ago
- ☆81 Updated last year
- ☆57 Updated 8 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆55 Updated 7 months ago
- LLM reads a paper and produces a working prototype ☆51 Updated 2 weeks ago
- Systematic evaluation framework that automatically rates overthinking behavior in large language models. ☆82 Updated this week
- Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners" ☆104 Updated 6 months ago
- ☆45 Updated 6 months ago
- Source code of the paper: RetrievalQA: Assessing Adaptive Retrieval-Augmented Generation for Short-form Open-Domain Question Answering [F… ☆62 Updated 10 months ago
- ☆50 Updated 4 months ago
- ☆48 Updated 4 months ago
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆52 Updated 3 months ago
- Evaluating LLMs with CommonGen-Lite ☆89 Updated last year
- Scalable Meta-Evaluation of LLMs as Evaluators ☆42 Updated last year
- Codebase accompanying the Summary of a Haystack paper. ☆75 Updated 6 months ago
- Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" 🤖 ☆62 Updated 3 months ago
- ☆73 Updated 2 months ago
- Explore the use of DSPy for extracting features from PDFs 🔎 ☆39 Updated last year
- MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents [EMNLP 2024] ☆139 Updated 2 months ago
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for… ☆25 Updated 3 months ago
- Interaction-first method for generating demonstrations for web-agents on any website ☆31 Updated 3 weeks ago
- Functional Benchmarks and the Reasoning Gap ☆84 Updated 5 months ago
- Public code repo for paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales" ☆102 Updated 6 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆91 Updated 2 months ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆67 Updated 4 months ago