liyucheng09 / LatestEval
Latest Evaluation Toolkit (LatestEval). Assessing language models with the latest, uncontaminated materials.
☆28Feb 17, 2025Updated 11 months ago
Alternatives and similar repositories for LatestEval
Users that are interested in LatestEval are comparing it to the libraries listed below
- Longitudinal Evaluation of LLMs via Data Compression☆33May 29, 2024Updated last year
- Public code release for the paper "Reawakening knowledge: Anticipatory recovery from catastrophic interference via structured training"☆11Oct 27, 2025Updated 3 months ago
- Source code of paper “A Novel Three-Stage Learning Framework for Low-Resource Knowledge-Grounded Dialogue Generation”☆16Nov 25, 2021Updated 4 years ago
- Self-Supervised Alignment with Mutual Information☆20May 24, 2024Updated last year
- Relative Preference Optimization: Enhancing LLM Alignment through Contrasting Responses across Identical and Diverse Prompts☆25Feb 23, 2024Updated last year
- Evaluating LLMs with Dynamic Data☆111Updated this week
- Explore what LLMs are really learning over SFT☆28Mar 30, 2024Updated last year
- PyTorch implementation of experiments in the paper Aligning Language Models with Human Preferences via a Bayesian Approach☆32Nov 6, 2023Updated 2 years ago
- PLATO dialog model with pre-trained parameters in pytorch version☆29May 20, 2022Updated 3 years ago
- [ACL 2025, Main Conference, Oral] Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process☆30Aug 2, 2024Updated last year
- Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment"☆75May 20, 2025Updated 8 months ago
- Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models☆50Sep 2, 2025Updated 5 months ago
- Reference implementation for Token-level Direct Preference Optimization(TDPO)☆151Feb 14, 2025Updated last year
- test images with not appropriate labels in MNIST dataset☆10Mar 3, 2018Updated 7 years ago
- this repository contains the source code for the ACL 2019 paper "Generating Long and Informative Reviews with Aspect-Aware Coarse-to-Fine…☆37Nov 29, 2019Updated 6 years ago
- Code and models for EMNLP 2024 paper "WPO: Enhancing RLHF with Weighted Preference Optimization"☆41Sep 24, 2024Updated last year
- ☆12Dec 14, 2024Updated last year
- Code for MERMAID : Metaphor Generation with Symbolism and Discriminative Decoding☆11May 2, 2022Updated 3 years ago
- https://avocado-captioner.github.io/☆29Oct 16, 2025Updated 4 months ago
- ☆12Jul 9, 2021Updated 4 years ago
- A simple ChatGPT plugin to manage upcoming AI conferences. Best way to learn ChatGPT plugin development.☆11May 14, 2023Updated 2 years ago
- Official Repo of "CIBench: Evaluation of LLMs as Code Interpreter "☆14Jul 19, 2024Updated last year
- EasyRLHF aims to provide an easy and minimal interface to train aligned language models, using off-the-shelf solutions and datasets☆10Dec 12, 2023Updated 2 years ago
- Modified Beam Search with periodical restart☆12Sep 12, 2024Updated last year
- Website for release of TellMeWhy dataset for why question answering☆14Nov 11, 2022Updated 3 years ago
- Information Extraction related tools and models☆10Mar 16, 2023Updated 2 years ago
- A Convolutional Neural Network For Multi-scale Taxi Trajectory Prediction☆14Dec 13, 2018Updated 7 years ago
- python project template for personal projects! 🙋♀️☆11Nov 28, 2020Updated 5 years ago
- A python tool help to interact with chatgpt.☆10Dec 11, 2022Updated 3 years ago
- Sound Separation, Omni modal☆28Sep 15, 2025Updated 5 months ago
- ☆20Jan 18, 2019Updated 7 years ago
- Mini Model Daemon☆12Nov 9, 2024Updated last year
- Efficient retrieval head analysis with triton flash attention that supports topK probability☆13Jun 15, 2024Updated last year
- The official repository for the paper entitled "Time Travel in LLMs: Tracing Data Contamination in Large Language Models."☆12Jun 11, 2024Updated last year
- Accepted to MLSys 2026☆70Jan 29, 2026Updated 2 weeks ago
- ☆15Mar 20, 2025Updated 10 months ago
- FamilyTool benchmark☆12Sep 10, 2025Updated 5 months ago
- Scripts for KGIRNet model for ESWC☆10Jul 6, 2023Updated 2 years ago
- Do Large Language Models Know What They Don’t Know?☆102Nov 8, 2024Updated last year