blcuicall / OMGEval
OMGEval: An Open Multilingual Generative Evaluation Benchmark for Foundation Models
☆35 · Updated last year
Alternatives and similar repositories for OMGEval
Users who are interested in OMGEval are comparing it to the libraries listed below.
- EMNLP'2023: Explore-Instruct: Enhancing Domain-Specific Instruction Coverage through Active Exploration ☆36 · Updated last year
- Code and data for the paper "Can Large Language Models Understand Real-World Complex Instructions?" (AAAI 2024) ☆49 · Updated last year
- ☆29 · Updated 2 years ago
- 🩺 A collection of ChatGPT evaluation reports on various benchmarks. ☆50 · Updated 2 years ago
- [ACL 2024] FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models ☆113 · Updated 3 months ago
- Collection of papers for scalable automated alignment. ☆93 · Updated 10 months ago
- ☆56 · Updated last year
- Official Implementation of "Probing Language Models for Pre-training Data Detection" ☆19 · Updated 9 months ago
- EMNLP'2024: Knowledge Verification to Nip Hallucination in the Bud ☆21 · Updated last year
- Code & Data for our Paper "RobustGEC: Robust Grammatical Error Correction Against Subtle Context Perturbation" (EMNLP 2023) ☆17 · Updated last year
- A retrieval augmented sequence modeling toolkit implemented based on Fairseq ☆29 · Updated 2 years ago
- Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models" ☆134 · Updated last year
- Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track) ☆93 · Updated 7 months ago
- ACL2023 (Oral): TemplateGEC: Improving Grammatical Error Correction with Detection Template ☆22 · Updated 2 years ago
- code for Teaching LM to Translate with Comparison ☆39 · Updated last year
- CFBench: A Comprehensive Constraints-Following Benchmark for LLMs ☆42 · Updated last year
- an easy-to-use knn-mt toolkit ☆104 · Updated 2 years ago
- ☆84 · Updated 8 months ago
- The repository for paper <Evaluating Open-QA Evaluation> ☆25 · Updated last year
- Implementation of "The Power of Scale for Parameter-Efficient Prompt Tuning" ☆58 · Updated 3 years ago
- [ACL 2024 (Oral)] A Prospector of Long-Dependency Data for Large Language Models ☆56 · Updated last year
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning ☆175 · Updated 2 months ago
- Implementation of "Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation" ☆82 · Updated 2 years ago
- [ChatGPT4MTevaluation] ErrorAnalysis Prompt for MT Evaluation in ChatGPT ☆89 · Updated last week
- An unofficial implementation of Self-Alignment with Instruction Backtranslation. ☆139 · Updated 4 months ago
- ☆33 · Updated last year
- Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation ☆89 · Updated 10 months ago
- [Findings of ACL'2023] Improving Contrastive Learning of Sentence Embeddings from AI Feedback ☆40 · Updated 2 years ago
- ☆145 · Updated last year
- Target-oriented Proactive Dialogue Systems with Personalization: Problem Formulation and Dataset Curation (EMNLP 2023) ☆30 · Updated last year