This repository stems from our paper, “Cataloguing LLM Evaluations”, and serves as a living, collaborative catalogue of LLM evaluation frameworks, benchmarks and papers.
☆20Nov 16, 2023Updated 2 years ago
Alternatives and similar repositories for LLM-Evals-Catalogue
Users that are interested in LLM-Evals-Catalogue are comparing it to the libraries listed below.
- Code for Preventing Language Models From Hiding Their Reasoning, which evaluates defenses against LLM steganography.☆25Jan 26, 2024Updated 2 years ago
- Developer documentation for EMF APIs☆14Apr 6, 2026Updated last week
- ☆19Sep 27, 2024Updated last year
- Hubcap is an autonomous AI agent in 25 lines of code: a small Autobot that you can't trust. *This is the Python fork/port* from https://g…☆22Nov 10, 2025Updated 5 months ago
- I know Kung Fu☆24Mar 27, 2025Updated last year
- ☆18Mar 28, 2026Updated 3 weeks ago
- ☆15Nov 25, 2025Updated 4 months ago
- Comparison of Metaflow, MLFlow and DVC☆14Aug 4, 2021Updated 4 years ago
- [SDM22] PyTorch implementation for "Neural Graph Matching for Pre-training Graph Neural Networks".☆17Apr 5, 2022Updated 4 years ago
- [EMNLP 2022] Code for our paper “ZeroGen: Efficient Zero-shot Learning via Dataset Generation”.☆47Feb 18, 2022Updated 4 years ago
- mySociety code common to several projects☆24Mar 25, 2026Updated 3 weeks ago
- A meta package for the PHP Quality Assurance Toolchain tools required by the Template for Jenkins Jobs for PHP Projects, http://jenkins-p…☆35Mar 20, 2015Updated 11 years ago
- Preprint/draft article/blog on some explainable machine learning misconceptions. WIP!☆29Jul 13, 2019Updated 6 years ago
- ☆10Oct 2, 2024Updated last year
- A directory of practical and usable AI agents resources from applications and platforms to frameworks and utilities and other parts of th…☆33Mar 28, 2026Updated 3 weeks ago
- Resources for exploring Generative Feedback Loops with Weaviate!☆39Apr 22, 2025Updated 11 months ago
- A mobile Implementation of llama.cpp☆26Oct 11, 2023Updated 2 years ago
- WhatsApp chatbot with Dialogflow and Twilio api☆10May 6, 2024Updated last year
- rOpenSci's San Francisco hackathon/unconf 2016☆23Mar 21, 2016Updated 10 years ago
- https://icml.cc/virtual/2023/poster/24354☆10Aug 15, 2023Updated 2 years ago
- Code for paper "AnswerQuest: A System for Generating Question-Answer Items from Multi-Paragraph Documents"☆19Jun 12, 2023Updated 2 years ago
- Sample notebooks and prompts for LLM evaluation☆160Nov 2, 2025Updated 5 months ago
- ☆18Jul 20, 2025Updated 8 months ago
- A public transit router for GTFS feeds (currently only static) written in modern c++☆24Mar 24, 2023Updated 3 years ago
- CLUE: A Clinical Language Understanding Evaluation for LLMs☆20Jan 22, 2025Updated last year
- ☆15May 23, 2024Updated last year
- ☆11Feb 3, 2025Updated last year
- Automatic prompt caching for Claude Code. Cuts token costs by up to 90% on repeated file reads, bug fix sessions, and long coding convers…☆96Apr 10, 2026Updated last week
- This project is an AI Recruitment System designed to accelerate the hiring process for HR and technical recruiters.☆15Jan 3, 2025Updated last year
- Practical ideas on securing machine learning models☆37May 27, 2021Updated 4 years ago
- [WWW demo 2026] TrustResearcher: Automating Knowledge-Grounded and Transparent Research Ideation with Multi-Agent Collaboration☆46Feb 26, 2026Updated last month
- ☆21Mar 8, 2026Updated last month
- Abstractive text summarization http://arxiv.org/abs/1509.00685☆24Mar 18, 2016Updated 10 years ago
- [EMNLP 2022] Distillation-Resistant Watermarking (DRW) for Model Protection in NLP☆13Aug 17, 2023Updated 2 years ago
- Labs for the "Build an agentic LLM assistant on AWS" workshop. A step by step agentic llm assistant development workshop using serverless…☆78Oct 14, 2025Updated 6 months ago
- Code of our paper "Method-Level Bug Severity Prediction using Source Code Metrics and LLMs" which is accepted to ISSRE 2023.☆10Nov 12, 2023Updated 2 years ago
- Pre-trained Online Contrastive Learning for Insurance Fraud Detection☆12Jul 12, 2024Updated last year
- Completions, code snippets helping you to get even more out of the amazing Fish shell☆45Sep 18, 2019Updated 6 years ago
- Common AI Agent written with Go. Supports MCP, RAG, A2A, AI Memory☆41Feb 9, 2026Updated 2 months ago