ZetangForward / L-CITEEVAL

L-CITEEVAL: DO LONG-CONTEXT MODELS TRULY LEVERAGE CONTEXT FOR RESPONDING?

☆23

Alternatives and similar repositories for L-CITEEVAL

Users who are interested in L-CITEEVAL are comparing it to the libraries listed below.

OpenNLG / OpenBA-v2
OpenBA-V2: 3B LLM (Large Language Model) with T5 architecture, utilizing model pruning technique and continuing pretraining from OpenBA-1…
☆25 Updated last year
OpenLMLab / LongWanjuan
Towards Systematic Measurement for Long Text Quality
☆34 Updated 8 months ago
meowpass / FollowComplexInstruction
Official implementation of the paper "From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large L…
☆48 Updated 10 months ago
ChengpengLi1003 / DotaMath
☆29 Updated 4 months ago
chtmp223 / suri
Suri: Multi-constraint instruction following for long-form text generation (EMNLP’24)
☆22 Updated 6 months ago
HillZhang1999 / ICD
Code & Data for our Paper "Alleviating Hallucinations of Large Language Models through Induced Hallucinations"
☆63 Updated last year
October2001 / ProLong
[ACL 2024 (Oral)] A Prospector of Long-Dependency Data for Large Language Models
☆55 Updated 9 months ago
halfrot / ALaRM
[ACL 2024] Code for the paper "ALaRM: Align Language Models via Hierarchical Rewards Modeling"
☆25 Updated last year
Spico197 / MoE-SFT
🍼 Official implementation of Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts
☆38 Updated 7 months ago
wwxu21 / CUT
Source code of "Reasons to Reject? Aligning Language Models with Judgments"
☆58 Updated last year
ernie-research / Tool-Augmented-Reward-Model
[ICLR'24 spotlight] Tool-Augmented Reward Modeling
☆47 Updated 4 months ago
yyDing1 / ScaleQuest
We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs.
☆62 Updated 6 months ago
chujiezheng / LLM-Extrapolation
Official repository for paper "Weak-to-Strong Extrapolation Expedites Alignment"
☆74 Updated 11 months ago
OpenMOSS / Say-I-Dont-Know
[ICML'2024] Can AI Assistants Know What They Don't Know?
☆80 Updated last year
jinzhuoran / RAG-RewardBench
RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment
☆16 Updated 4 months ago
songmzhang / DSKD
Repo for the EMNLP'24 Paper "Dual-Space Knowledge Distillation for Large Language Models". A general white-box KD framework for both same…
☆49 Updated 6 months ago
ChaosCodes / ProPETL
One Network, Many Masks: Towards More Parameter-Efficient Transfer Learning
☆39 Updated last year
HarlynDN / WebCiteS
[ACL'24] WebCiteS: Attributed Query-Focused Summarization on Chinese Web Search Results with Citations
☆12 Updated 8 months ago
KbsdJames / omni-math-rule
The rule-based evaluation subset and code implementation of Omni-MATH
☆21 Updated 4 months ago
GAIR-NLP / weak-to-strong-reasoning
☆59 Updated 8 months ago
YJiangcm / LTE
[ACL 2024] Learning to Edit: Aligning LLMs with Knowledge Editing
☆35 Updated 8 months ago
zhaochen0110 / conflictbank
Code and data for "ConflictBank: A Benchmark for Evaluating the Influence of Knowledge Conflicts in LLM" (NeurIPS 2024 Track Datasets and…
☆42 Updated 6 months ago
GAIR-NLP / BeHonest
BeHonest: Benchmarking Honesty in Large Language Models
☆31 Updated 9 months ago
yunx-z / COMBO
Merging Generated and Retrieved Knowledge for Open-Domain QA (EMNLP 2023)
☆22 Updated last year
HKUNLP / STRING
[ICLR'25] Data and code for our paper "Why Does the Effective Context Length of LLMs Fall Short?"
☆75 Updated 5 months ago
PKU-Baichuan-MLSystemLab / SysBench
SysBench: Can Large Language Models Follow System Messages?
☆29 Updated 8 months ago
hanqi-qi / Mirror
☆12 Updated last year
hahahawu / Long-to-Short-via-Model-Merging
Model merging is a highly efficient approach for long-to-short reasoning.
☆46 Updated last month
iwangjian / TopDial
Target-oriented Proactive Dialogue Systems with Personalization: Problem Formulation and Dataset Curation (EMNLP 2023)
☆30 Updated last year
princeton-pli / LongProc
LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation
☆23 Updated 3 weeks ago