lyt719 / LLM-evaluation-datasets
☆35Updated last year
Alternatives and similar repositories for LLM-evaluation-datasets
Users who are interested in LLM-evaluation-datasets are comparing it to the repositories listed below
- Controllable Text Generation for Large Language Models: A Survey☆196Updated last year
- Official GitHub repo for SafetyBench, a comprehensive benchmark to evaluate LLMs' safety. [ACL 2024]☆265Updated 4 months ago
- Flames is a highly adversarial Chinese benchmark for evaluating the harmlessness of LLMs, developed by Shanghai AI Lab and Fudan NLP Group.☆62Updated last year
- Templates and examples for ACL and EMNLP conference posters.☆14Updated last year
- ☆177Updated last year
- LLM hallucination paper list☆327Updated last year
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"☆150Updated last year
- Neural Code Intelligence Survey 2024; Reading lists and resources☆279Updated 4 months ago
- A collection of survey papers and resources related to Large Language Models (LLMs).☆40Updated last year
- Code for the paper "Discerning and Resolving Knowledge Conflicts through Adaptive Decoding with Contextual Information-Entropy Constraint"☆11Updated last year
- [NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other mo…☆409Updated 5 months ago
- A live reading list for LLM data synthesis (Updated to July, 2025).☆424Updated 3 months ago
- Awesome papers for role-playing with language models☆215Updated last year
- ☆146Updated last year
- Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models"☆136Updated last year
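- Fine-tuning large language models with the DPO algorithm; simple and easy to get started.☆49Updated last year
- Full-parameter fine-tuning, LoRA fine-tuning, and QLoRA fine-tuning of Llama 3.☆212Updated last year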
- R-Judge: Benchmarking Safety Risk Awareness for LLM Agents (EMNLP Findings 2024)☆93Updated 7 months ago
- The repository for the survey paper "Survey on Large Language Models Factuality: Knowledge, Retrieval and Domain-Specificity"☆340Updated last year
- Awesome-Long2short-on-LRMs is a collection of state-of-the-art, novel, exciting long2short methods on large reasoning models. It contains…☆254Updated 4 months ago
- A curated reading list for large language model (LLM) alignment. Take a look at our new survey "Large Language Model Alignment: A Survey"…☆81Updated 2 years ago
- [ACL 2024] A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future☆478Updated 11 months ago
- Official GitHub repo for AutoDetect, an automated weakness detection framework for LLMs.☆44Updated last year
- ☆352Updated last year
- Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation☆56Updated 11 months ago
- Large Language Models (LLMs) of Code☆19Updated 2 years ago
- Collection of training data management explorations for large language models☆336Updated last year
- An Awesome Collection for LLM Survey☆381Updated 6 months ago
- ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors [EMNLP 2024 Findings]☆218Updated last year
- ☆51Updated 9 months ago