Sample notebooks and prompts for LLM evaluation
☆161Nov 2, 2025Updated 7 months ago
Alternatives and similar repositories for LLM-Evaluation
Users that are interested in LLM-Evaluation are comparing it to the libraries listed below.
- A simple repository that showcases a few LLM evaluation strategies and leverages W&B Sweeps to optimize the LLM system.☆12Jul 11, 2023Updated 2 years ago
- This repository contains multi-modal speech data for African languages that can be used to train ASR and NLP models☆17Aug 31, 2022Updated 3 years ago
- Perplexity Lite using Langgraph, Tavily, and GPT-4.☆14Jan 11, 2024Updated 2 years ago
- LLM evaluation.☆16Nov 7, 2023Updated 2 years ago
- Study the temporal performance degradation of machine learning models.☆16Jan 26, 2024Updated 2 years ago
- Repository for my LLM notebooks☆30Aug 8, 2024Updated last year
- Vectorized implementation of a general feedforward neural network in Python☆10Jan 22, 2017Updated 9 years ago
- Large-language Model Evaluation framework with Elo Leaderboard and A-B testing☆52Oct 24, 2024Updated last year
- Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering"☆87Aug 12, 2024Updated last year
- ComfyUI node for modular, human‑like Kani TTS. Generate natural, high‑quality speech from text☆38Oct 17, 2025Updated 7 months ago
- ☆19Jun 26, 2024Updated last year
- ☆22Oct 18, 2024Updated last year
- ☆17Apr 24, 2024Updated 2 years ago
- An unofficial Georgia Tech theme for JupyterLab☆10Jun 29, 2021Updated 4 years ago
- ☆11Feb 25, 2025Updated last year
- ☆11Feb 15, 2022Updated 4 years ago
- A tool for evaluating LLMs☆429Mar 15, 2026Updated 2 months ago
- ☆29Apr 29, 2024Updated 2 years ago
- Just a bunch of benchmark logs for different LLMs☆127Jul 28, 2024Updated last year
- Use Grounding DINO, Segment Anything, and CLIP to label objects in images.☆35Dec 27, 2023Updated 2 years ago
- 🔧 Compare how Agent systems perform on several benchmarks. 📊🚀☆103Aug 4, 2025Updated 10 months ago
- The papers are organized according to our survey: Evaluating Large Language Models: A Comprehensive Survey.☆803May 8, 2024Updated 2 years ago
- Large Scale Benchmark of Large Language Models on African Languages☆19Jul 28, 2025Updated 10 months ago
- 🔍 LangKit: An open-source toolkit for monitoring Large Language Models (LLMs). 📚 Extracts signals from prompts & responses, ensuring sa…☆990Nov 22, 2024Updated last year
- Winners of the TissueNet: Detect Lesions in Cervical Biopsies competition☆22Sep 7, 2023Updated 2 years ago
- ☆23Jul 10, 2025Updated 11 months ago
- Domain-Specific Text Generation for Machine Translation (with LLMs) - scripts and config files for the paper☆18Aug 19, 2023Updated 2 years ago
- This project aims to extract regions of interest (ROIs) such as the fingertip, palmprint, and hand geometry from a single hand image.☆10Aug 24, 2023Updated 2 years ago
- Real time facial emotion recognition☆10May 11, 2021Updated 5 years ago
- ☆15Apr 1, 2020Updated 6 years ago
- ☆28Feb 11, 2026Updated 4 months ago
- A collection of lab materials from the Digitalent Scholarship Python Essentials 2019 course☆10Mar 27, 2022Updated 4 years ago
- ☆10Jun 29, 2022Updated 3 years ago
- Tutorial on probabilistic classification and cost-sensitive learning.☆13Aug 19, 2025Updated 9 months ago
- Training HuggingFace models using fastai☆11Jul 22, 2021Updated 4 years ago
- First comprehensive survey of NLP work carried out in Senegalese languages covering various tasks + Applications in the social sciences.☆29May 27, 2026Updated 2 weeks ago
- MAFAND-MT☆62Jul 9, 2024Updated last year
- 🦖 X—LLM: Cutting Edge & Easy LLM Finetuning☆408Jan 17, 2024Updated 2 years ago
- [ECCV 2024] M3DBench introduces a comprehensive 3D instruction-following dataset with support for interleaved multi-modal prompts.☆61Oct 1, 2024Updated last year