kimyuji/EvolvingQA_benchmark

Code and Dataset release of "Carpe Diem: On the Evaluation of World Knowledge in Lifelong Language Models" (NAACL 2024)

☆10

Alternatives and similar repositories for EvolvingQA_benchmark

Users who are interested in EvolvingQA_benchmark are comparing it to the repositories listed below.

joeljang / continual-knowledge-learning
[ICLR 2022] Towards Continual Knowledge Learning of Language Models
☆91 · Oct 11, 2022 · Updated 3 years ago
MattYoon / reasoning-models-confidence
[NeurIPS 2025] Reasoning Models Better Express Their Confidence
☆23 · Nov 19, 2025 · Updated 8 months ago
AIRC-KETI / long-ke-t5
☆13 · Jul 31, 2023 · Updated 2 years ago
joeljang / temporalwiki
[EMNLP 2022] TemporalWiki: A Lifelong Benchmark for Training and Evaluating Ever-Evolving Language Models
☆75 · May 15, 2024 · Updated 2 years ago
xieyxclack / factual_coco
The implementation of <Factual Consistency Evaluation for Text Summarization via Counterfactual Estimation> in PyTorch.
☆17 · Nov 11, 2021 · Updated 4 years ago
hunkim / ACL-2020-Papers
Statistics and Accepted paper list of ACL 2020 with arXiv link
☆23 · May 30, 2020 · Updated 6 years ago
agwaBom / AsmDepictor
Official implementation of AsmDepictor, "A Transformer-based Function Symbol Name Inference Model from an Assembly Language for Binary Re…
☆29 · Apr 30, 2024 · Updated 2 years ago
sungnyun / openssl-simcore
(CVPR 2023) Coreset Sampling from Open-Set for Fine-Grained Self-Supervised Learning
☆30 · Oct 3, 2023 · Updated 2 years ago
yizhongw / llm-temporal-alignment
Methods and evaluation for aligning language models temporally
☆31 · Mar 2, 2024 · Updated 2 years ago
joeljang / FLM
All-in-one repository for Fine-tuning & Pretraining (Large) Language Models
☆15 · Mar 8, 2023 · Updated 3 years ago
r-three / realistic_evaluation_of_model_merging_for_compositional_generalization
☆13 · Feb 11, 2026 · Updated 5 months ago
LouisDo2108 / MediaEval2022-TailAwareSpermDetection
"Tail-Aware Sperm Analysis for Transparent Tracking of Spermatozoa" Official Implementation
☆10 · Jan 21, 2026 · Updated 6 months ago
LostCow / KLUE
KLUE Benchmark 1st place (2021.12) solutions. (RE, MRC, NLI, STS, TC)
☆25 · Apr 11, 2022 · Updated 4 years ago
euiin / SMART
SMART introduces a novel test-time framework where Small Language Models (SLMs) reason step-by-step, and Large Language Models (LLMs) pro…
☆12 · Jul 9, 2025 · Updated last year
prometheus-eval / scaling-evaluation-compute
Repository for "Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators"
☆12 · Mar 25, 2025 · Updated last year
martell / pthreads-win32.cmake
☆11 · Nov 4, 2012 · Updated 13 years ago
jongwooko / NASH-Pruning-Official
Code Implementation for "NASH: A Simple Unified Framework of Structured Pruning for Accelerating Encoder-Decoder Language Models" (EMNLP …
☆17 · Oct 17, 2023 · Updated 2 years ago
llyx97 / Rosita
[AAAI 2021] "ROSITA: Refined BERT cOmpreSsion with InTegrAted techniques", Yuanxin Liu, Zheng Lin, Fengcheng Yuan
☆14 · Oct 18, 2022 · Updated 3 years ago
hbin0701 / Self-Explore
[EMNLP Findings 2024 & ACL 2024 NLRSE Oral] Enhancing Mathematical Reasonin…
☆52 · May 4, 2024 · Updated 2 years ago
varshakishore / IncDSI
☆11 · Sep 10, 2023 · Updated 2 years ago
SprocketLab / Alchemist
☆12 · Mar 4, 2025 · Updated last year
Noahs-ARK / RFA
☆33 · Apr 12, 2021 · Updated 5 years ago
moqingyan / dsr-lm
☆13 · Jul 8, 2023 · Updated 3 years ago
uchicago-cs / plrg
PL Reading Group Website
☆15 · Jan 12, 2026 · Updated 6 months ago
cavedweller509 / SentenceVAE
Enable Next-sentence Prediction for Large Language Models with Faster Speed, Higher Accuracy and Longer Context
☆42 · Aug 16, 2024 · Updated last year
CHLee0801 / TemporalWikiDatasets
☆13 · Apr 24, 2022 · Updated 4 years ago
circle-hit / Lens
Code for our paper titled "Lens: Rethinking Multilingual Enhancement for Large Language Models"
☆12 · Oct 15, 2024 · Updated last year
AIRC-KETI / ke-t5-downstreams
☆39 · Mar 25, 2024 · Updated 2 years ago
KID-22 / Cocktail
Cocktail: A Comprehensive Information Retrieval Benchmark with LLM-Generated Documents Integration
☆15 · Jun 4, 2024 · Updated 2 years ago
pnuailab / parser
Korean sentence analysis system BCD-KL-Parser
☆10 · Jun 23, 2020 · Updated 6 years ago
HansiZeng / scaling-retriever
[SIGIR 2025] The official repo for "Scaling Sparse and Dense Retrieval in Decoder-Only LLMs"
☆22 · Mar 31, 2025 · Updated last year
joonkeekim / Instructive-Decoding
Official repository of "Distort, Distract, Decode: Instruction-Tuned Model Can Refine its Response from Noisy Instructions", ICLR 2024 Sp…
☆21 · Mar 7, 2024 · Updated 2 years ago
OPTML-Group / DP4TL
[NeurIPS2023] "Selectivity Drives Productivity: Efficient Dataset Pruning for Enhanced Transfer Learning" by Yihua Zhang*, Yimeng Zhang*,…
☆14 · Oct 12, 2023 · Updated 2 years ago
ruizheng20 / gpo
The code of paper "Toward Optimal LLM Alignments Using Two-Player Games".
☆17 · Jun 20, 2024 · Updated 2 years ago
Arvid-pku / ATOKE
[AAAI 2024] History Matters: Temporal Knowledge Editing in Large Language Model
☆13 · Dec 17, 2023 · Updated 2 years ago
sungnyun / avsr-temporal-dynamics
(SLT 2024) Learning Video Temporal Dynamics with Cross-Modal Attention for Robust Audio-Visual Speech Recognition
☆13 · Oct 22, 2024 · Updated last year
TREC-RAG / trec-rag.github.io
Website for TREC RAG
☆14 · Updated this week
SungjoonPark / DeepNLP2
Deep NLP 2 (2019.3-5)
☆10 · Feb 19, 2019 · Updated 7 years ago
jinkilee / hello-transformer
☆11 · Aug 6, 2022 · Updated 3 years ago