A comprehensive guide to LLM evaluation methods, designed to help identify the most suitable evaluation techniques for a given use case, promote best practices in LLM assessment, and critically assess the effectiveness of these methods.
☆190 · Apr 30, 2026 · Updated last month
Alternatives and similar repositories for LLMEvaluation
Users that are interested in LLMEvaluation are comparing it to the libraries listed below.
- Bangla TTS Inference pipeline using Vit TTS ☆13 · Mar 24, 2024 · Updated 2 years ago
- A framework for few-shot evaluation of autoregressive language models. ☆13 · Feb 14, 2024 · Updated 2 years ago
- A project for implementing ML and NLP papers ☆13 · May 22, 2020 · Updated 6 years ago
- [ICML 2025] Official implementation of the paper "SkipGPT: Dynamic Layer Pruning Reinvented with Token Awareness and Module Decoupling". … ☆23 · Nov 17, 2025 · Updated 6 months ago
- This is code for How Do Social Bots Participate in Misinformation Spread? A Comprehensive Dataset and Analysis ☆18 · Nov 5, 2025 · Updated 6 months ago
- RACF (Recency Aware Collaborative Filtering) is the implementation of the "Recency Aware Collaborative Filtering for Next Basket Recommen… ☆25 · Feb 15, 2025 · Updated last year
- Fine-tune a Bangla ASR model that was trained on the Bangla Mozilla Common Voice dataset ☆12 · Apr 16, 2024 · Updated 2 years ago
- Awesome Bangla Datasets ☆46 · Mar 29, 2025 · Updated last year
- ☆17 · Nov 23, 2023 · Updated 2 years ago
- ☆13 · May 30, 2024 · Updated last year
- Inspect: A framework for large language model evaluations ☆2,137 · Updated this week
- ☆24 · Dec 12, 2024 · Updated last year
- Experimental CUDA kernel framework unifying typed dimensions, NVRTC JIT specialization, and ML-guided tuning. ☆46 · Feb 9, 2026 · Updated 3 months ago
- Experiments to assess SPADE on different LLM pipelines. ☆17 · Apr 7, 2024 · Updated 2 years ago
- Notebooks demonstrating example applications of the cleanvision library ☆17 · Dec 16, 2025 · Updated 5 months ago
- Deploy automl models for tabular tasks on AWS Sagemaker with AutoGluon ☆13 · Feb 28, 2020 · Updated 6 years ago
- My configuration files, loosely inspired by @sontek ☆38 · May 13, 2026 · Updated 2 weeks ago
- A tool that can be used to measure the sequential performance of any OpenAI-compatible LLM API ☆24 · Aug 1, 2024 · Updated last year
- KL3M training data collection and preprocessing