Latest Evaluation Toolkit (LatestEval). Assessing language models with the latest, uncontaminated materials.
☆28 · Feb 17, 2025 · Updated last year
Alternatives and similar repositories for LatestEval
Users that are interested in LatestEval are comparing it to the libraries listed below
- Public code release for the paper "Reawakening knowledge: Anticipatory recovery from catastrophic interference via structured training" ☆11 · Oct 27, 2025 · Updated 4 months ago
- Self-Supervised Alignment with Mutual Information ☆20 · May 24, 2024 · Updated last year
- Evaluating LLMs with Dynamic Data ☆112 · Feb 11, 2026 · Updated 3 weeks ago
- (EACL 2021) Discourse-Aware Unsupervised Summarization of Long Scientific Documents ☆25 · Jun 12, 2023 · Updated 2 years ago
- Explore what LLMs are really learning over SFT ☆28 · Mar 30, 2024 · Updated last year
- PyTorch implementation of experiments in the paper Aligning Language Models with Human Preferences via a Bayesian Approach ☆32 · Nov 6, 2023 · Updated 2 years ago
- PLATO dialog model with pre-trained parameters in pytorch version ☆29 · May 20, 2022 · Updated 3 years ago
- [ACL 2025, Main Conference, Oral] Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process ☆30 · Aug 2, 2024 · Updated last year
- Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment" ☆75 · May 20, 2025 · Updated 9 months ago
- Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models ☆50 · Sep 2, 2025 · Updated 6 months ago
- Reference implementation for Token-level Direct Preference Optimization (TDPO) ☆152 · Feb 14, 2025 · Updated last year
- ☆13 · Dec 5, 2022 · Updated 3 years ago
- this repository contains the source code for the ACL 2019 paper "Generating Long and Informative Reviews with Aspect-Aware Coarse-to-Fine… ☆37 · Nov 29, 2019 · Updated 6 years ago
- ViViDex implementation under the SAPIEN simulator, ICRA 2025 ☆17 · Apr 9, 2025 · Updated 11 months ago
- AIrmageddon is a home security AI Agent ☆11 · Aug 30, 2024 · Updated last year
- TOD-Flow: Modeling the Structure of Task-Oriented Dialogues ☆13 · Feb 7, 2024 · Updated 2 years ago
- test images with not appropriate labels in MNIST dataset ☆10 · Mar 3, 2018 · Updated 8 years ago
- UFSAC is a resource containing all WordNet Sense Annotated Corpora, and a Java library for manipulating them ☆38 · May 17, 2022 · Updated 3 years ago
- Code and models for EMNLP 2024 paper "WPO: Enhancing RLHF with Weighted Preference Optimization" ☆41 · Sep 24, 2024 · Updated last year
- Efficient retrieval head analysis with triton flash attention that supports topK probability ☆13 · Jun 15, 2024 · Updated last year
- Mini Model Daemon ☆12 · Nov 9, 2024 · Updated last year
- A simple ChatGPT plugin to manage upcoming AI conferences. Best way to learn ChatGPT plugin development. ☆11 · May 14, 2023 · Updated 2 years ago
- ☆12 · Jul 9, 2021 · Updated 4 years ago
- Chicago Social Interaction Model (chiSIM) framework repository ☆12 · Aug 9, 2023 · Updated 2 years ago
- python project template for personal projects! 🙋‍♀️ ☆11 · Nov 28, 2020 · Updated 5 years ago
- Modified Beam Search with periodical restart ☆12 · Sep 12, 2024 · Updated last year
- Scripts for KGIRNet model for ESWC ☆10 · Jul 6, 2023 · Updated 2 years ago
- ☆12 · May 18, 2023 · Updated 2 years ago
- The official repository for the paper entitled "Time Travel in LLMs: Tracing Data Contamination in Large Language Models." ☆12 · Jun 11, 2024 · Updated last year
- Official Repo of "CIBench: Evaluation of LLMs as Code Interpreter" ☆14 · Jul 19, 2024 · Updated last year
- a Video Quality Analysis Toolkit ☆13 · May 16, 2025 · Updated 9 months ago
- Accepted to MLSys 2026 ☆70 · Mar 2, 2026 · Updated last week
- A simple method based on RGB-D camera and Google MediaPipe for accurate and lightweight 3D hand tracking ☆14 · Nov 4, 2024 · Updated last year
- FamilyTool benchmark ☆12 · Sep 10, 2025 · Updated 5 months ago
- A Convolutional Neural Network For Multi-scale Taxi Trajectory Prediction ☆14 · Dec 13, 2018 · Updated 7 years ago
- Information Extraction related tools and models ☆10 · Mar 16, 2023 · Updated 2 years ago
- ☆12 · Dec 14, 2024 · Updated last year
- code and dataset of EMNLP 2020 paper "PARADE: A New Dataset for Paraphrase Identification Requiring Computer Science Domain Knowledge" ☆12 · Nov 6, 2020 · Updated 5 years ago
- ☆15 · Mar 20, 2025 · Updated 11 months ago