WisdomShell / FreeEval
☆16 · Updated 9 months ago
Alternatives and similar repositories for FreeEval:
Users who are interested in FreeEval are comparing it to the repositories listed below.
- [ACL'24] A Knowledge-grounded Interactive Evaluation Framework for Large Language Models ☆36 · Updated 9 months ago
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey" ☆114 · Updated 7 months ago
- Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains… ☆203 · Updated last week
- ☆14 · Updated last year
- ☆117 · Updated 7 months ago
- Controllable Text Generation for Large Language Models: A Survey ☆171 · Updated 8 months ago
- Repository for Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning ☆162 · Updated last year
- LLM hallucination paper list ☆315 · Updated last year
- The repository for the survey paper <<Survey on Large Language Models Factuality: Knowledge, Retrieval and Domain-Specificity>> ☆339 · Updated last year
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning". ☆119 · Updated 6 months ago
- A method of ensemble learning for heterogeneous large language models. ☆58 · Updated 8 months ago
- Paper collections of retrieval-based (augmented) language model. ☆232 · Updated 11 months ago
- 【ACL 2024】 SALAD benchmark & MD-Judge ☆145 · Updated last month
- ☆62 · Updated 3 months ago
- Code associated with Tuning Language Models by Proxy (Liu et al., 2024) ☆109 · Updated last year
- ☆19 · Updated 11 months ago
- Large Language Models Meet NL2Code: A Survey ☆36 · Updated 5 months ago
- The demo, code and data of FollowRAG ☆72 · Updated last week
- Large Language Models (LLMs) of Code ☆17 · Updated 2 years ago
- [ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning ☆92 · Updated 11 months ago
- [ICLR 2025] Released code for paper "Spurious Forgetting in Continual Learning of Language Models" ☆38 · Updated 2 months ago
- The repository for paper <Evaluating Open-QA Evaluation> ☆24 · Updated last year
- The code and data of DPA-RAG, accepted by WWW 2025 main conference. ☆60 · Updated 3 months ago
- ☆55 · Updated 6 months ago
- A versatile toolkit for applying Logit Lens to modern large language models (LLMs). Currently supports Llama-3.1-8B and Qwen-2.5-7B, enab… ☆75 · Updated 2 months ago
- Awesome LLM Self-Consistency: a curated list of Self-consistency in Large Language Models ☆96 · Updated 8 months ago
- Chain of Thoughts (CoT) is so hot! so long! We need short reasoning process! ☆50 · Updated last month
- [ACL2024] Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios ☆56 · Updated last year
- ☆71 · Updated last year
- ☆32 · Updated 7 months ago