WisdomShell / FreeEval
☆17 Updated 10 months ago
Alternatives and similar repositories for FreeEval
Users who are interested in FreeEval are comparing it to the libraries listed below.
- [ACL'24] A Knowledge-grounded Interactive Evaluation Framework for Large Language Models ☆36 Updated 11 months ago
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey" ☆126 Updated 9 months ago
- Codes for the EMNLP 2023 Findings paper "Self-Polish: Enhance Reasoning in Large Language Models via Problem Refining" by Zhiheng Xi, Sen… ☆30 Updated 2 years ago
- ☆133 Updated 9 months ago
- LLM hallucination paper list ☆318 Updated last year
- ☆19 Updated last year
- R-Judge: Benchmarking Safety Risk Awareness for LLM Agents (EMNLP Findings 2024) ☆77 Updated last month
- [ACL 2024] An Easy-to-use Hallucination Detection Framework for LLMs. ☆34 Updated 4 months ago
- ☆66 Updated 4 months ago
- Repository for Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning ☆164 Updated last year
- 【ACL 2024】 SALAD benchmark & MD-Judge ☆150 Updated 3 months ago
- Repo for paper "Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents" ☆52 Updated last year
- Exploiting Unlabeled Data for Target-Oriented Opinion Words Extraction ☆24 Updated 2 years ago
- A method of ensemble learning for heterogeneous large language models. ☆58 Updated 10 months ago
- Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains… ☆228 Updated 3 weeks ago
- Controllable Text Generation for Large Language Models: A Survey ☆179 Updated 10 months ago
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning". ☆122 Updated 7 months ago
- Official code implementation for the ACL 2025 paper: 'CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis' ☆27 Updated last month
- Code for "Small Models are Valuable Plug-ins for Large Language Models" ☆129 Updated 2 years ago
- The repository for paper <Evaluating Open-QA Evaluation> ☆24 Updated last year
- Official github repo for AutoDetect, an automated weakness detection framework for LLMs. ☆42 Updated last year
- [COLM'24] Corex: Pushing the Boundaries of Complex Reasoning through Multi-Model Collaboration ☆29 Updated 8 months ago
- LLM Unlearning ☆169 Updated last year
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning ☆94 Updated last year
- [COLING 2025] ToolEyes: Fine-Grained Evaluation for Tool Learning Capabilities of Large Language Models in Real-world Scenarios ☆68 Updated last month
- [ACL2024] Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios ☆58 Updated last year
- A curated list of Awesome-LLM-Ensemble papers for the survey "Harnessing Multiple Large Language Models: A Survey on LLM Ensemble" ☆73 Updated last week
- ☆141 Updated last year
- [ICLR 2025] Released code for paper "Spurious Forgetting in Continual Learning of Language Models" ☆47 Updated last month
- EMNLP'23 survey: a curation of awesome papers and resources on refreshing large language models (LLMs) without expensive retraining. ☆133 Updated last year