aastroza / structured-generation-benchmark
Structured Generation Evals
☆14 Updated last year
Alternatives and similar repositories for structured-generation-benchmark
Users who are interested in structured-generation-benchmark are comparing it to the libraries listed below.
- minimal pytorch implementation of bm25 (with sparse tensors) ☆104 Updated 3 months ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔ ☆35 Updated 9 months ago
- ☆29 Updated 3 months ago
- PyTorch implementation for MRL ☆21 Updated last year
- An introduction to LLM Sampling ☆79 Updated last year
- Mixing Language Models with Self-Verification and Meta-Verification ☆112 Updated last year
- Probabilistic LLM evaluations. [CogSci2023; ACL2023] ☆73 Updated last year
- This repository contains code for cleaning your training data of benchmark data to help combat data snooping. ☆27 Updated 2 years ago
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832). ☆82 Updated last year
- ☆105 Updated last year
- Benchmark structured generation libraries ☆30 Updated last year
- Advanced Reasoning Benchmark Dataset for LLMs ☆47 Updated 2 years ago
- Extract full next-token probabilities via language model APIs ☆248 Updated last year
- ☆56 Updated last year
- ☆59 Updated last year
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆150 Updated last month
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆72 Updated last year
- Small, simple agent task environments for training and evaluation ☆19 Updated last year
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr… ☆66 Updated 2 months ago
- Just a bunch of benchmark logs for different LLMs ☆119 Updated last year
- Functional Benchmarks and the Reasoning Gap ☆89 Updated last year
- Fast, High-Fidelity LLM Decoding with Regex Constraints ☆21 Updated last year
- ☆45 Updated 2 years ago
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ☆102 Updated last year
- Chat Markup Language conversation library ☆55 Updated 2 years ago
- Supercharge huggingface transformers with model parallelism. ☆78 Updated 6 months ago
- ☆53 Updated last year
- Experiments for efforts to train a new and improved t5 ☆76 Updated last year
- PyLate efficient inference engine ☆71 Updated last month
- Lightweight tools for quick and easy LLM demos ☆28 Updated last year