Sample notebooks and prompts for LLM evaluation
☆161 · Nov 2, 2025 · Updated 4 months ago
Alternatives and similar repositories for LLM-Evaluation
Users who are interested in LLM-Evaluation are comparing it to the libraries listed below.
- Vectorized implementation of a general feedforward neural network in Python ☆10 · Jan 22, 2017 · Updated 9 years ago
- Routing with reinforcement learning ☆10 · Apr 9, 2022 · Updated 3 years ago
- Study the temporal performance degradation of machine learning models. ☆16 · Jan 26, 2024 · Updated 2 years ago
- ☆28 · Updated this week
- Code for our paper: Hierarchical RL Using an Ensemble of Proprioceptive Periodic Policies ☆15 · Feb 21, 2019 · Updated 7 years ago
- ☆19 · Jun 26, 2024 · Updated last year
- Adding NeMo Guardrails to a LlamaIndex RAG pipeline ☆41 · Feb 20, 2024 · Updated 2 years ago
- Sample app for podcast generation using Azure OpenAI ☆22 · Mar 7, 2025 · Updated last year
- LLM evaluation. ☆16 · Nov 7, 2023 · Updated 2 years ago
- Notebooks for RAG optimization workshop, using HackerNews data ☆21 · Mar 27, 2024 · Updated last year
- Scalable Meta-Evaluation of LLMs as Evaluators ☆43 · Feb 15, 2024 · Updated 2 years ago
- ☆25 · Dec 12, 2025 · Updated 2 months ago
- Code for Zero-shot Triplet Extraction by Template Infilling (Kim et al; IJCNLP-AACL 2023) ☆21 · Feb 17, 2024 · Updated 2 years ago
- Based on "long-form-factuality" a python based processor to easily fact check anything. ☆20 · Apr 1, 2024 · Updated last year
- Repository for my LLM notebooks ☆29 · Aug 8, 2024 · Updated last year
- Build Neo4J Knowledge Graphs from Excel files ☆22 · Nov 18, 2024 · Updated last year
- 🔍 Attentional interfaces in TensorFlow. ☆61 · Dec 26, 2018 · Updated 7 years ago
- 🔍 LangKit: An open-source toolkit for monitoring Large Language Models (LLMs). 📚 Extracts signals from prompts & responses, ensuring sa… ☆977 · Nov 22, 2024 · Updated last year
- Evaluating LLMs with fewer examples ☆169 · Apr 12, 2024 · Updated last year
- List of Computer Science courses with video lectures. ☆27 · Feb 17, 2022 · Updated 4 years ago
- ☆28 · Feb 13, 2026 · Updated 3 weeks ago
- Codes, scripts, and notebooks on various aspects of transformer models. ☆27 · Feb 27, 2023 · Updated 3 years ago
- Self Driving Car development tools and technologies from GTA Robotics Community members ☆169 · Sep 20, 2017 · Updated 8 years ago
- ☆29 · Apr 29, 2024 · Updated last year
- OPSTL: Self-supervised Skeleton-based Action Recognition in Occluded Environments ☆14 · Oct 25, 2023 · Updated 2 years ago
- Foundation model benchmarking tool. Run any model on any AWS platform and benchmark for performance across instance type and serving stac… ☆255 · Apr 11, 2025 · Updated 11 months ago
- We develop benchmarks and analysis tools to evaluate the causal reasoning abilities of LLMs. ☆137 · May 29, 2024 · Updated last year
- Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering" ☆87 · Aug 12, 2024 · Updated last year
- A tool for evaluating LLMs ☆428 · May 10, 2024 · Updated last year
- 🦖 X—LLM: Cutting Edge & Easy LLM Finetuning ☆408 · Jan 17, 2024 · Updated 2 years ago
- Sample applications built on the Graphlit Platform ☆76 · Oct 11, 2025 · Updated 4 months ago
- In this GitHub repository, we will demonstrate how to utilize MongoDB to build an automated underwriting process to calculate a customize… ☆11 · Feb 11, 2026 · Updated 3 weeks ago
- Repository for samples presented at SDD 2023 ☆11 · Nov 24, 2023 · Updated 2 years ago
- Here, I provided the solution for exercises of IBM Quantum Challenge 2020 ☆10 · Oct 27, 2020 · Updated 5 years ago
- We implement MADDPG in a congestion env, and compare with several control groups to highlight the performance of MADDPG ☆11 · Jul 14, 2021 · Updated 4 years ago
- First comprehensive survey of NLP work carried out in Senegalese languages covering various tasks + Applications in the social sciences. ☆25 · Updated this week
- A high-throughput and memory-efficient inference and serving engine for LLMs ☆12 · Nov 14, 2025 · Updated 3 months ago
- ☆17 · Feb 6, 2025 · Updated last year
- ☆13 · Sep 23, 2025 · Updated 5 months ago