WisdomShell / FreeEval

☆17

Alternatives and similar repositories for FreeEval

Users who are interested in FreeEval are comparing it to the libraries listed below.

zhuohaoyu / KIEval
[ACL'24] A Knowledge-grounded Interactive Evaluation Framework for Large Language Models
☆36 Updated 11 months ago
pillowsofwind / Knowledge-Conflicts-Survey
[EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"
☆126 Updated 9 months ago
WooooDyy / Self-Polish
Codes for the EMNLP 2023 Findings paper "Self-Polish: Enhance Reasoning in Large Language Models via Problem Refining" by Zhiheng Xi, Sen…
☆30 Updated 2 years ago
AmourWaltz / Reliable-LLM
☆133 Updated 9 months ago
LuckyyySTA / Awesome-LLM-hallucination
LLM hallucination paper list
☆318 Updated last year
epang-ucas / Evaluate_LLMs_to_Genes
☆19 Updated last year
Lordog / R-Judge
R-Judge: Benchmarking Safety Risk Awareness for LLM Agents (EMNLP Findings 2024)
☆77 Updated last month
zjunlp / EasyDetect
[ACL 2024] An Easy-to-use Hallucination Detection Framework for LLMs.
☆34 Updated 4 months ago
qianlanwyd / paper-citation-ranking
☆66 Updated 4 months ago
lancopku / label-words-are-anchors
Repository for Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning
☆164 Updated last year
OpenSafetyLab / SALAD-BENCH
【ACL 2024】 SALAD benchmark & MD-Judge
☆150 Updated 3 months ago
OpenBMB / Tell_Me_More
Repo for paper "Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents"
☆52 Updated last year
TOWESSL / TOWESSL
Exploiting Unlabeled Data for Target-Oriented Opinion Words Extraction
☆24 Updated 2 years ago
OrangeInSouth / DeePEn
A method of ensemble learning for heterogeneous large language models.
☆58 Updated 10 months ago
Hongcheng-Gao / Awesome-Long2short-on-LRMs
Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains…
☆228 Updated 3 weeks ago
IAAR-Shanghai / CTGSurvey
Controllable Text Generation for Large Language Models: A Survey
☆179 Updated 10 months ago
sail-sg / sdft
[ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning".
☆122 Updated 7 months ago
RUCKBReasoning / CoT-based-Synthesizer
Official code implementation for the ACL 2025 paper: 'CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis'
☆27 Updated last month
JetRunner / SuperICL
Code for "Small Models are Valuable Plug-ins for Large Language Models"
☆129 Updated 2 years ago
wangcunxiang / QA-Eval
The repository for paper <Evaluating Open-QA Evaluation>
☆24 Updated last year
thu-coai / AutoDetect
Official github repo for AutoDetect, an automated weakness detection framework for LLMs.
☆42 Updated last year
QiushiSun / Corex
[COLM'24] Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration
☆29 Updated 8 months ago
kevinyaobytedance / llm_unlearn
LLM Unlearning
☆169 Updated last year
SafeAILab / RAIN
[ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning
☆94 Updated last year
Junjie-Ye / ToolEyes
[COLING 2025] ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios
☆68 Updated last month
JoeYing1019 / UltraTool
[ACL2024] Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios
☆58 Updated last year
junchenzhi / Awesome-LLM-Ensemble
A curated list of Awesome-LLM-Ensemble papers for the survey "Harnessing Multiple Large Language Models: A Survey on LLM Ensemble"
☆73 Updated last week
CASIA-LM / MoDS
☆141 Updated last year
zzz47zzz / spurious-forgetting
[ICLR 2025] Released code for paper "Spurious Forgetting in Continual Learning of Language Models"
☆47 Updated last month
hyintell / awesome-refreshing-llms
EMNLP'23 survey: a curation of awesome papers and resources on refreshing large language models (LLMs) without expensive retraining.
☆133 Updated last year