blcuicall / OMGEval
OMGEval: An Open Multilingual Generative Evaluation Benchmark for Foundation Models
☆34 · Updated 11 months ago
Alternatives and similar repositories for OMGEval
Users who are interested in OMGEval compare it to the repositories listed below.
- EMNLP'2023: Explore-Instruct: Enhancing Domain-Specific Instruction Coverage through Active Exploration ☆36 · Updated last year
- ☆29 · Updated 2 years ago
- 🩺 A collection of ChatGPT evaluation reports on various benchmarks. ☆49 · Updated 2 years ago
- [Findings of ACL'2023] Improving Contrastive Learning of Sentence Embeddings from AI Feedback ☆39 · Updated last year
- Dataset and evaluation script for "Evaluating Hallucinations in Chinese Large Language Models" ☆130 · Updated last year
- Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation ☆85 · Updated 8 months ago
- ☆33 · Updated last year
- ☆55 · Updated 10 months ago
- Official Implementation of "Probing Language Models for Pre-training Data Detection" ☆19 · Updated 7 months ago
- Code and data for the paper "Can Large Language Models Understand Real-World Complex Instructions?" (AAAI 2024) ☆48 · Updated last year
- ACL2023 (Oral): TemplateGEC: Improving Grammatical Error Correction with Detection Template ☆22 · Updated 2 years ago
- [ACL 2024] FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models ☆107 · Updated last month
- [ACL 23] CodeIE: Large Code Generation Models are Better Few-Shot Information Extractors ☆36 · Updated 5 months ago
- [ICML'2024] Can AI Assistants Know What They Don't Know? ☆81 · Updated last year
- [ACL'24] MC^2: A Multilingual Corpus of Minority Languages in China (Tibetan, Uyghur, Kazakh, and Mongolian) ☆25 · Updated 3 weeks ago
- Code for "Teaching LM to Translate with Comparison" ☆39 · Updated last year
- Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track) ☆87 · Updated 4 months ago
- Collection of papers for scalable automated alignment. ☆92 · Updated 8 months ago
- A retrieval augmented sequence modeling toolkit implemented based on Fairseq ☆29 · Updated 2 years ago
- Code & Data for our Paper "RobustGEC: Robust Grammatical Error Correction Against Subtle Context Perturbation" (EMNLP 2023) ☆17 · Updated last year
- Implementation of "Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation" ☆80 · Updated last year
- ☆17 · Updated 4 months ago
- The implementation for our paper, "Improving Simultaneous Machine Translation with Monolingual Data," accepted to AAAI 2023. ☆12 · Updated last year
- [ACL 2024] MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues ☆99 · Updated 11 months ago
- ACL'2023: Multi-Task Pre-Training of Modular Prompt for Few-Shot Learning ☆40 · Updated 2 years ago
- [ChatGPT4MTevaluation] ErrorAnalysis Prompt for MT Evaluation in ChatGPT ☆89 · Updated last year
- An unofficial implementation of Self-Alignment with Instruction Backtranslation. ☆140 · Updated 2 months ago
- ☆142 · Updated last year
- CFBench: A Comprehensive Constraints-Following Benchmark for LLMs ☆36 · Updated 10 months ago
- Target-oriented Proactive Dialogue Systems with Personalization: Problem Formulation and Dataset Curation (EMNLP 2023) ☆31 · Updated last year