DigitalHarborFoundation / FlexEval
☆12 · Updated last week
Alternatives and similar repositories for FlexEval
Users who are interested in FlexEval are comparing it to the libraries listed below.
- ☆33 · Updated 2 years ago
- Edu-ConvoKit: An Open-Source Framework for Education Conversation Data ☆95 · Updated 2 months ago
- Code and data for the paper "Measuring Conversational Uptake: A Case-Study on Student-Teacher Interactions" ☆24 · Updated 2 months ago
- ☆53 · Updated last year
- NAACL 2024. Code & Dataset for "🌁 Bridging the Novice-Expert Gap via Models of Decision-Making: A Case Study on Remediating Math Mistake… ☆41 · Updated 11 months ago
- ☆22 · Updated 3 years ago
- Course repository for the session "Hands-on Transformers: Fine-Tune your own BERT and GPT" of the Data Science Summer School 2023 ☆87 · Updated last year
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers ☆130 · Updated last year
- ☆13 · Updated 3 years ago
- Dataset used to evaluate Skill Extraction systems based on the ESCO skills taxonomy. ☆13 · Updated 11 months ago
- A repository with several curated datasets of counter-narratives to fight online hate speech. ☆89 · Updated 2 years ago
- Repository for Zheng and Guha et al., 2021, "When Does Pretraining Help? Assessing Self-Supervised Learning for Law and the CaseHOLD Data… ☆90 · Updated 2 years ago
- An Evaluation Taxonomy for Pedagogical Ability Assessment of LLM-Powered AI Tutors ☆12 · Updated 2 weeks ago
- This repository hosts the paper "LLM Based Math Tutoring: Challenges and Dataset", along with the accompanying dataset. It explores the p… ☆47 · Updated 9 months ago
- This repository contains a dataset of ≈2K dialogues whose listener utterances are annotated with labels derived from the Motiva-… ☆17 · Updated 2 years ago
- Package to extract connotation frames ☆85 · Updated last year
- ☆95 · Updated last year
- ☆43 · Updated 2 years ago
- Can Large Language Models Be an Alternative to Human Evaluations? ☆9 · Updated last year
- [ICLR 2024 & NeurIPS 2023 WS] An Evaluator LM that is open-source, offers reproducible evaluation, and is inexpensive to use. Specifically d… ☆299 · Updated last year
- ☆69 · Updated last year
- A curated list of awesome datasets with human label variation (un-aggregated labels) in Natural Language Processing and Computer Vision, … ☆85 · Updated last year
- ☆24 · Updated 2 years ago
- Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering" ☆84 · Updated 10 months ago
- This repository contains the two datasets introduced in the paper "Making Science Simple: Corpora for the Lay Summarisation of Scientific… ☆25 · Updated last year
- Bayesian IRT models in Python ☆142 · Updated this week
- Clustering sentence embeddings to extract message intent ☆174 · Updated 3 years ago
- A package to run embedded topic modelling with ETM. Adapted from the original at: https://github.com/adjidieng/ETM ☆95 · Updated last year
- This repository contains materials for the SIGIR 2022 tutorial on opinion summarization. ☆34 · Updated 2 years ago
- ☆106 · Updated last year