aastroza / structured-generation-benchmark
Structured Generation Evals
☆12 Updated 11 months ago
Alternatives and similar repositories for structured-generation-benchmark
Users that are interested in structured-generation-benchmark are comparing it to the libraries listed below
- A framework for pitting LLMs against each other in an evolving library of games ⚔ ☆33 Updated 4 months ago
- ☆28 Updated 2 months ago
- ☆89 Updated 7 months ago
- Minimal PyTorch implementation of BM25 (with sparse tensors) ☆104 Updated last year
- Probabilistic LLM evaluations. [CogSci2023; ACL2023] ☆73 Updated last year
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ☆94 Updated 8 months ago
- ☆68 Updated last month
- Commit0: Library Generation from Scratch ☆161 Updated 3 months ago
- A domain-specific probabilistic programming language for modeling and inference with language models ☆133 Updated 3 months ago
- An introduction to LLM Sampling ☆79 Updated 8 months ago
- ☆77 Updated last week
- Fast, High-Fidelity LLM Decoding with Regex Constraints ☆20 Updated last year
- Sphynx Hallucination Induction ☆53 Updated 6 months ago
- Benchmark structured generation libraries ☆29 Updated 10 months ago
- ☆53 Updated 2 months ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆65 Updated last year
- Small, simple agent task environments for training and evaluation ☆18 Updated 9 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆60 Updated 11 months ago
- Functional Benchmarks and the Reasoning Gap ☆88 Updated 10 months ago
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆88 Updated 10 months ago
- Storing long contexts in tiny caches with self-study ☆140 Updated last week
- Retrieval Augmented Generation Generalized Evaluation Dataset ☆55 Updated last month
- Extract full next-token probabilities via language model APIs ☆247 Updated last year
- Train your own SOTA deductive reasoning model ☆104 Updated 5 months ago
- ☆49 Updated 6 months ago
- ☆23 Updated 3 months ago
- A repository for transformer critique learning and generation ☆90 Updated last year
- ☆63 Updated 3 weeks ago
- ReLM is a Regular Expression engine for Language Models ☆106 Updated 2 years ago
- ☆57 Updated 11 months ago