isle-dev/MetricEval

MetricEval: A framework that conceptualizes and operationalizes four main components of metric evaluation, in terms of reliability and validity

☆12

Alternatives and similar repositories for MetricEval

Users who are interested in MetricEval are comparing it to the libraries listed below.

alohays / openai-tool2mcp
MCP wrapper for OpenAI built-in tools
☆12 · Mar 13, 2025 · Updated last year
zhiao777774 / awesome-personalized-lm
A curated list of personalized Language model / Large language model (continually updated)
☆10 · Nov 17, 2023 · Updated 2 years ago
eagle705 / awesome-nlp-note
A curated list of resources dedicated to NLP (papers, blogs, notes, etc.)
☆13 · Nov 30, 2019 · Updated 6 years ago
FacerAin / facerain.github.io
☆10 · Feb 16, 2025 · Updated last year
kakao / diatool-dpo
☆15 · Aug 25, 2025 · Updated 10 months ago
awslabs / durepa-hybrid-qa
☆12 · Mar 22, 2024 · Updated 2 years ago
ros-infrastructure / rosdoc_lite
A light-weight version of rosdoc that does not rely on ROS infrastructure for crawling packages.
☆10 · Apr 16, 2024 · Updated 2 years ago
e-apostolidis / PoR-Summarization-Measure
A python implementation for computing the PoR metric for video summarization from "Performance over Random: A Robust Evaluation Protocol …
☆10 · May 4, 2022 · Updated 4 years ago
stanis-morozov / self-supervised-gan-eval
☆13 · Mar 30, 2021 · Updated 5 years ago
mybirth0407 / snu-stat-bigdata-computing
bigdata-bootcamp for graduate students in statistics at Seoul National University
☆26 · Aug 26, 2025 · Updated 10 months ago
bbuing9 / DND
Code for the paper "What Makes Better Augmentation Strategies? Augment Difficult but Not too Different" (ICLR 22)
☆12 · Aug 28, 2023 · Updated 2 years ago
sunlab-osu / IterPrompt
☆19 · Nov 7, 2022 · Updated 3 years ago
hsajjad / ConceptX
Analyzing Latent Concept in Pre-trained Transformer Models
☆12 · Jul 18, 2022 · Updated 4 years ago
LeonEricsson / llmjudge
Exploring limitations of LLM-as-a-judge
☆20 · Aug 17, 2024 · Updated last year
proboscis / pinjected
☆16 · Nov 4, 2025 · Updated 8 months ago
Yijia-Xiao / Know2BIO
Know2BIO: A Comprehensive Dual-View Benchmark for Evolving Biomedical Knowledge Graphs
☆15 · Feb 10, 2026 · Updated 5 months ago
LG-AI-EXAONE / KMMLU-Pro
☆16 · Aug 18, 2025 · Updated 11 months ago
riktor / KGPL
A TensorFlow implementation of KGPL
☆11 · Jan 1, 2021 · Updated 5 years ago
gigio1023 / minecraft-llm-agent-community
A Soul-grounded Minecraft social simulation runtime where Mineflayer actors pursue LifeGoals through evidence-backed action skills and tr…
☆24 · Updated this week
gmftbyGMFTBY / Rep-Dropout
[NeurIPS 2023] Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective
☆41 · Oct 17, 2023 · Updated 2 years ago
chongyangtao / LLMs-for-NLG-Evaluation
Awesome LLM for NLG Evaluation Papers
☆26 · Jan 23, 2024 · Updated 2 years ago
lisa-wm / entropybaseduq
☆12 · Apr 4, 2025 · Updated last year
bhargaviparanjape / explainable_qa
Implementation for https://arxiv.org/abs/2005.00652
☆27 · Dec 8, 2022 · Updated 3 years ago
julian-risch / toxic-comment-collection
Code for our WOAH@ACL 2021 Paper on Data Integration for Toxic Comment Classification: Making More Than 40 Datasets Easily Accessible in …
☆30 · Nov 25, 2021 · Updated 4 years ago
RUCAIBox / RLMEC
The official repository of "Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint"
☆39 · Jan 12, 2024 · Updated 2 years ago
Sanaxen / cpp_torch
A tiny-dnn port based on libtorch. Header-only, with no dependencies other than libtorch (a deep learning framework).
☆37 · Nov 21, 2024 · Updated last year
Roshanpaswan / tkPDFViewer
tkPDFViewer is a Python library developed by Roshan Paswan that allows you to embed a PDF file in your tkinter GUI.
☆13 · Dec 4, 2022 · Updated 3 years ago
MusicTextSynaesthesia / MusicTextSynaesthesia
☆10 · Sep 17, 2022 · Updated 3 years ago
xhan77 / veiled-toxicity-detection
Fortifying Toxic Speech Detectors Against Veiled Toxicity
☆11 · Oct 21, 2020 · Updated 5 years ago
Tiiiger / templm
Code release for "TempLM: Distilling Language Models into Template-Based Generators"
☆14 · Jul 21, 2022 · Updated 4 years ago
ASGuard-UCI / ld-metric
Code for the paper entitled "Towards Driving-Oriented Metric for Lane Detection Models" (CVPR 2022)
☆25 · Mar 19, 2022 · Updated 4 years ago
chenzen94 / debug-deepspeed-chat
Debug DeepSpeed-Chat step by step in an IDE
☆10 · Apr 17, 2023 · Updated 3 years ago
mayu-ot / hidden-challenges-MR
Code for Uncovering Hidden Challenges in Query-Based Video Moment Retrieval
☆20 · Sep 7, 2020 · Updated 5 years ago
personalrobotics / collaborative_manipulation_corpus
A Corpus of Natural Language Instructions for Collaborative Manipulation
☆15 · Feb 15, 2017 · Updated 9 years ago
Yale-LILY / ROSE
☆41 · Jun 7, 2023 · Updated 3 years ago
joeljang / knowledge-unlearning
[ACL 2023] Knowledge Unlearning for Mitigating Privacy Risks in Language Models
☆89 · Sep 12, 2024 · Updated last year
noagarcia / ROLL-VideoQA
PyTorch code for ROLL, a knowledge-based video story question answering model.
☆21 · Sep 29, 2020 · Updated 5 years ago
HPC-Fortran2CPP / Fortran2Cpp
Fortran2Cpp: A new model designed for code translation between Fortran and C++
☆17 · Mar 28, 2025 · Updated last year
noagarcia / ArtVQA
AQUA dataset and VIKING model for the task of Art Visual Question Answering
☆27 · Jun 4, 2021 · Updated 5 years ago