liyucheng09/LatestEval

Latest Evaluation Toolkit (LatestEval). Assessing language models with the latest, uncontaminated materials.

☆29

Alternatives and similar repositories for LatestEval

Users who are interested in LatestEval are comparing it to the libraries listed below.


liyucheng09 / llm-compressive
Longitudinal Evaluation of LLMs via Data Compression
☆32 · May 29, 2024 · Updated 2 years ago
agentic-learning-ai-lab / anticipatory-recovery
Public code release for the paper "Reawakening knowledge: Anticipatory recovery from catastrophic interference via structured training"
☆11 · Oct 27, 2025 · Updated 9 months ago
neukg / KAT-TSLF
Source code of paper “A Novel Three-Stage Learning Framework for Low-Resource Knowledge-Grounded Dialogue Generation”
☆16 · Nov 25, 2021 · Updated 4 years ago
mismayil / crow
Benchmarking Commonsense Reasoning in Real-World Tasks
☆12 · Dec 14, 2023 · Updated 2 years ago
bergen / EdgeTransformer
☆22 · Dec 1, 2021 · Updated 4 years ago
Jellyfish042 / uncheatable_eval
Evaluating LLMs with Dynamic Data
☆117 · Updated this week
amzn / faithful-data2text-cycle-training
☆11 · Jul 11, 2023 · Updated 3 years ago
epfl-dlab / forc
Framework for Cost-Effective Language Model Choice
☆16 · Dec 12, 2023 · Updated 2 years ago
heyunh2015 / PARADE_dataset
Code and dataset of EMNLP 2020 paper "PARADE: A New Dataset for Paraphrase Identification Requiring Computer Science Domain Knowledge"
☆12 · Nov 6, 2020 · Updated 5 years ago
mirandrom / HipoRank
(EACL 2021) Discourse-Aware Unsupervised Summarization of Long Scientific Documents
☆25 · Jun 12, 2023 · Updated 3 years ago
StonyBrookNLP / tellmewhy
Website for release of TellMeWhy dataset for why question answering
☆14 · Nov 11, 2022 · Updated 3 years ago
turboLJY / Coarse-to-Fine-Review-Generation
this repository contains the source code for the ACL 2019 paper "Generating Long and Informative Reviews with Aspect-Aware Coarse-to-Fine…
☆37 · Nov 29, 2019 · Updated 6 years ago
microsoft / KID
Knowledge Infused Decoding
☆70 · Dec 31, 2023 · Updated 2 years ago
yinyueqin / relative-preference-optimization
Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts
☆26 · Feb 23, 2024 · Updated 2 years ago
janphilippfranken / sami
Self-Supervised Alignment with Mutual Information
☆20 · May 24, 2024 · Updated 2 years ago
McGill-NLP / diffusion-itm
Code and data setup for the paper "Are Diffusion Models Vision-and-language Reasoners?"
☆33 · Mar 15, 2024 · Updated 2 years ago
WebNLG / challenge-2020
Submissions, baselines and evaluations scripts for the 2nd version of the WebNLG+ Challenge 2020
☆13 · Feb 1, 2022 · Updated 4 years ago
lyy1994 / awesome-data-contamination
The Paper List on Data Contamination for Large Language Models Evaluation.
☆117 · Jun 2, 2026 · Updated last month
jiamingkong / rwkv_reward
Training a reward model for RLHF using RWKV.
☆15 · Jun 5, 2023 · Updated 3 years ago
Silin159 / PeaCoK
☆35 · Jan 7, 2026 · Updated 6 months ago
Triang-jyed-driung / rwkv7mini
RWKV-7 mini
☆12 · Mar 29, 2025 · Updated last year
Triang-jyed-driung / RWKV-LM-RLHF-DPO
Direct Preference Optimization for RWKV, aiming for RWKV-5 and 6.
☆11 · Mar 1, 2024 · Updated 2 years ago
LLaMafia / SFT_function_learning
Explore what LLMs are really learning over SFT
☆28 · Mar 30, 2024 · Updated 2 years ago
SmerkyG / GoldFinch-paper
GoldFinch and other hybrid transformer components
☆16 · Dec 9, 2025 · Updated 7 months ago
Repast / chiSIM
Chicago Social Interaction Model (chiSIM) framework repository
☆12 · Aug 9, 2023 · Updated 2 years ago
GATECH-EIC / LLM4HWDesign_Starting_Toolkit
LLM4HWDesign Starting Toolkit
☆20 · Oct 4, 2024 · Updated last year
wangjs9 / Aligned-dPM
PyTorch implementation of experiments in the paper Aligning Language Models with Human Preferences via a Bayesian Approach
☆32 · Nov 6, 2023 · Updated 2 years ago
yinzhangyue / SelfAware
Do Large Language Models Know What They Don’t Know?
☆103 · Nov 8, 2024 · Updated last year
chujiezheng / LLM-Extrapolation
Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment"
☆75 · May 20, 2025 · Updated last year
algoprog / SynTOD
Synthetic data generation for TODs
☆23 · Jul 17, 2024 · Updated 2 years ago
tuhinjubcse / MetaphorGenNAACL2021
Code for MERMAID: Metaphor Generation with Symbolism and Discriminative Decoding
☆11 · May 2, 2022 · Updated 4 years ago
THU-KEG / Xlore2.0
Xlore2.0 Code [BaiduExtractor, HudongExtractor, WikiExtractor, XloreData, XloreWeb]
☆12 · Apr 5, 2017 · Updated 9 years ago
telepathylabsai / prompt-based-user-simulator
In-Context Learning User Simulators for Task-Oriented Dialog Systems
☆29 · Jun 2, 2023 · Updated 3 years ago
RWKV-Wiki / rwkv-wiki.github.io
RWKV Wiki website (archived, please visit official wiki)
☆10 · Mar 26, 2023 · Updated 3 years ago
microsoft / experiential_rl
The official codebase for "Experiential Reinforcement Learning" - https://arxiv.org/pdf/2602.13949v1
☆76 · Jul 2, 2026 · Updated 3 weeks ago
HwwAncient / Pytorch-PLATO
PLATO dialog model with pre-trained parameters in pytorch version
☆29 · May 20, 2022 · Updated 4 years ago
smellslikeml / distributed-deep-learning-workshop
☆13 · Dec 5, 2022 · Updated 3 years ago
acl-org / emnlp-2023
Repository containing the website for the EMNLP 2023 conference
☆17 · Feb 12, 2025 · Updated last year
xaiguy / chippy
☆13 · Feb 26, 2023 · Updated 3 years ago