Comprehensive LLM evaluation at scale: A production-ready framework for evaluating large language models across multiple benchmarks.
☆41 · Jun 26, 2026 · Updated this week
Alternatives and similar repositories for eval-framework
Users that are interested in eval-framework are comparing it to the libraries listed below.
- A landing page with all info around Polkadot Technical Fellowship ☆16 · Apr 27, 2026 · Updated 2 months ago
- Code for "Mind Your Inflections! Improving NLP for Non-Standard Englishes with Base-Inflection Encoding" (EMNLP 2020). ☆11 · May 1, 2025 · Updated last year
- Code for the paper "Greed is All You Need: An Evaluation of Tokenizer Inference Methods" ☆13 · Nov 26, 2024 · Updated last year
- Implementation for the paper "Unified Multimodal Model with Unlikelihood Training for Visual Dialog" ☆13 · May 12, 2023 · Updated 3 years ago
- Can Large Language Models Identify Authorship? (EMNLP 2024 Findings) ☆13 · Feb 4, 2025 · Updated last year
- Source code for paper "VD-PCR: Improving Visual Dialog with Pronoun Coreference Resolution" ☆10 · Nov 1, 2022 · Updated 3 years ago
- Split bib files for anthology bibliography for overleaf ☆11 · Aug 25, 2024 · Updated last year
- ACL 2021 paper "Style is NOT a single variable: Case Studies for Cross-Style Language Understanding" by Dongyeop Kang and Eduard Hovy ☆15 · Jul 19, 2021 · Updated 4 years ago
- DSTC8-AVSD: Sentence generation task for Audio Visual Scene-aware Dialog ☆14 · Jun 10, 2021 · Updated 5 years ago
- ☆30 · May 6, 2026 · Updated last month
- The account/rank pairs which the Technical Committee should introduce. ☆39 · Feb 14, 2023 · Updated 3 years ago
- CLEVR-X: A Visual Reasoning Dataset for Natural Language Explanations ☆30 · Oct 27, 2023 · Updated 2 years ago
- [NAACL 2025] Beyond End-to-End VLMs: Leveraging Intermediate Text Representations for Superior Flowchart Understanding ☆21 · Aug 23, 2025 · Updated 10 months ago
- This is a Utrecht University dissertation template for LaTeX ☆22 · Jul 31, 2025 · Updated 10 months ago
- Experiments on including metadata such as URLs, timestamps, website descriptions and HTML tags during pretraining. ☆30 · Jun 12, 2023 · Updated 3 years ago
- [ACL 2025 Main] Official Repo for Paper "Measuring Data Diversity for Instruction Tuning: A Systematic Analysis and A Reliable Metric" ☆41 · Feb 10, 2026 · Updated 4 months ago
- A Collection of Pydantic Models to Abstract IRL ☆41 · Dec 10, 2025 · Updated 6 months ago
- ☆27 · May 4, 2020 · Updated 6 years ago
- Repository (preliminary codes) for DSTC10 SIMMC track. ☆19 · Dec 9, 2022 · Updated 3 years ago
- [EMNLP 2020] Collective HumAn OpinionS on Natural Language Inference Data ☆42 · Apr 7, 2022 · Updated 4 years ago
- Official Repository of NeurIPS2021 paper: PTR ☆32 · Dec 17, 2021 · Updated 4 years ago
- ☆30 · Oct 20, 2021 · Updated 4 years ago
- ☆42 · Oct 3, 2024 · Updated last year
- Visual Dialog: Light-weight Transformer for Many Inputs (ECCV 2020) ☆29 · Aug 5, 2021 · Updated 4 years ago
- An Educational Framework Based on PyTorch for Deep Learning Education and Exploration ☆11 · Dec 24, 2023 · Updated 2 years ago
- A RAG that can scale 🧑🏻‍💻 ☆11 · May 28, 2024 · Updated 2 years ago
- Traffic Assignment frameworK (TAsK). Traffic assignment algorithms for the conventional and non-additive traffic assignment problems. ☆47 · Feb 11, 2018 · Updated 8 years ago
- PyTorch code for Reasoning Visual Dialogs with Structural and Partial Observations ☆42 · Jun 30, 2021 · Updated 4 years ago
- Sequence Modeling with Structured State Spaces ☆68 · Aug 2, 2022 · Updated 3 years ago
- Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic pr… ☆71 · Jun 15, 2026 · Updated 2 weeks ago
- An intensive academic program teaching Blockchain, Substrate, and Polkadot. ☆93 · May 27, 2026 · Updated last month
- Karaoke Editor ☆61 · Jan 9, 2024 · Updated 2 years ago
- PyTorch implementation of the paper "Dialogue Act Classification with Context-Aware Self-Attention" for dialogue act classification with … ☆45 · Aug 1, 2023 · Updated 2 years ago
- Repository to generate CLEVR-Dialog: A diagnostic dataset for Visual Dialog ☆50 · Feb 18, 2020 · Updated 6 years ago
- Explorations into the proposed SDFT, Self-Distillation Enables Continual Learning, from Shenfeld et al. of MIT ☆32 · Feb 6, 2026 · Updated 4 months ago
- A Model Agnostic function to directly remove specified layers from the LLM ☆10 · May 23, 2024 · Updated 2 years ago
- Code for the paper Non-Autoregressive Dialog State Tracking (ICLR20) ☆44 · Feb 25, 2020 · Updated 6 years ago
- A list where most values will be None (or default) ☆11 · Jun 22, 2026 · Updated last week
- 🤖📚 Telegram bot to convert and email PDFs, EPUBs or MOBIs to your Kindle ☆11 · Sep 16, 2022 · Updated 3 years ago