aastroza / structured-generation-benchmark
Structured Generation Evals
☆12 Updated 4 months ago
Alternatives and similar repositories for structured-generation-benchmark:
Users who are interested in structured-generation-benchmark are comparing it to the libraries listed below:
- ☆27 Updated 4 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆53 Updated 5 months ago
- Generate interleaved text and image content in a structured format you can directly pass to downstream APIs. ☆26 Updated 3 months ago
- ☆56 Updated last week
- Training code for Sparse Autoencoders on Embedding models ☆35 Updated 2 months ago
- Probabilistic programming with HuggingFace language models ☆93 Updated this week
- ☆80 Updated 3 weeks ago
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ☆69 Updated last month
- Advanced Reasoning Benchmark Dataset for LLMs ☆45 Updated last year
- ☆48 Updated 2 months ago
- Sphynx Hallucination Induction ☆51 Updated 5 months ago
- ☆19 Updated 3 months ago
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆24 Updated 10 months ago
- gzip Predicts Data-dependent Scaling Laws ☆33 Updated 8 months ago
- Small, simple agent task environments for training and evaluation ☆18 Updated 2 months ago
- A domain-specific probabilistic programming language for modeling and inference with language models ☆114 Updated last year
- Codebase accompanying the Summary of a Haystack paper. ☆74 Updated 4 months ago
- ☆21 Updated 7 months ago
- ☆55 Updated this week
- minimal pytorch implementation of bm25 (with sparse tensors) ☆97 Updated 10 months ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆52 Updated 11 months ago
- Understanding the correlation between different LLM benchmarks ☆29 Updated last year
- Benchmark structured generation libraries ☆24 Updated 3 months ago
- ☆48 Updated last year
- ☆31 Updated 7 months ago
- ☆20 Updated 2 months ago
- Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions ☆41 Updated 5 months ago
- A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments ☆69 Updated 3 months ago
- Very minimal (and stateless) agent framework ☆41 Updated 2 weeks ago
- Experiments for efforts to train a new and improved t5 ☆77 Updated 9 months ago