aastroza / structured-generation-benchmark
Structured Generation Evals
☆12 Updated 8 months ago
Alternatives and similar repositories for structured-generation-benchmark
Users who are interested in structured-generation-benchmark are comparing it to the repositories listed below.
- ☆28 Updated 8 months ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔ ☆32 Updated last month
- Training code for Sparse Autoencoders on Embedding models ☆38 Updated 3 months ago
- A guide to structured generation using constrained decoding ☆11 Updated 11 months ago
- Small, simple agent task environments for training and evaluation ☆18 Updated 7 months ago
- Understanding the correlation between different LLM benchmarks ☆29 Updated last year
- Probabilistic LLM evaluations. [CogSci2023; ACL2023] ☆73 Updated 10 months ago
- Advanced Reasoning Benchmark Dataset for LLMs ☆46 Updated last year
- ☆57 Updated 3 weeks ago