🩺 A collection of ChatGPT evaluation reports on various benchmarks.
☆50Mar 28, 2023Updated 3 years ago
Alternatives and similar repositories for awesome-lm-evaluation
Users who are interested in awesome-lm-evaluation are comparing it to the libraries listed below.
- ✒️ ChatGPT as a writing partner.☆14Mar 6, 2023Updated 3 years ago
- ☆18Mar 10, 2023Updated 3 years ago
- Official Implementation of "Learning to Refuse: Towards Mitigating Privacy Risks in LLMs"☆10Dec 13, 2024Updated last year
- High-order syntactic parsing based on tree-structured conditional random fields☆16Apr 28, 2022Updated 4 years ago
- 🍼 Official implementation of Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts☆41Sep 29, 2024Updated last year
- Official Implementation of "Probing Language Models for Pre-training Data Detection"☆20Dec 4, 2024Updated last year
- The information of NLP PhD application in the world.☆37Aug 27, 2024Updated last year
- Commonsense Reasoning and Moral Understanding Evaluation in Children's Stories (CRMU)☆18Oct 22, 2024Updated last year
- [EMNLP'23] Code for "Non-autoregressive Text Editing with Copy-aware Latent Alignments".☆20Oct 17, 2023Updated 2 years ago
- Code & Data for our Paper "RobustGEC: Robust Grammatical Error Correction Against Subtle Context Perturbation" (EMNLP 2023)☆17Jan 23, 2024Updated 2 years ago
- 🪞A powerful toolkit for almost all the Information Extraction tasks.☆124Apr 21, 2025Updated last year
- 😎 A simple and easy-to-use toolkit for GPU scheduling.☆45May 12, 2025Updated last year
- ☆20Jun 3, 2024Updated 2 years ago
- ☆14Aug 18, 2022Updated 3 years ago
- [COLING'22] Code for "Semantic Role Labeling as Dependency Parsing: Exploring Latent Tree Structures Inside Arguments".☆61Oct 8, 2023Updated 2 years ago
- [ICLR 2025] RaSA: Rank-Sharing Low-Rank Adaptation☆10May 19, 2025Updated last year
- Official Implementation of ACL2023: Don't Parse, Choose Spans! Continuous and Discontinuous Constituency Parsing via Autoregressive Span …☆14Aug 25, 2023Updated 2 years ago
- Technical Report: Is ChatGPT a Good NLG Evaluator? A Preliminary Study☆42Mar 8, 2023Updated 3 years ago
- The repo of "Improving Seq2Seq Grammatical Error Correction via Decoding Interventions"☆32Jan 22, 2024Updated 2 years ago
- The Code & Paper for ACL 2023 paper "Enhancing Language Representation with Constructional Information for Natural Language Understanding…☆20Jan 18, 2025Updated last year
- ☆12May 6, 2024Updated 2 years ago
- This is the code for neural-Jacana aligner, and the data for MultiMWA dataset.☆20Feb 12, 2023Updated 3 years ago
- ☆13Feb 7, 2023Updated 3 years ago
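- 格物: lightweight multilingual and Chinese large-scale pre-trained models, covering pure-Chinese, knowledge-enhanced, and 113-language multilingual variants. Built on the mainstream RoBERTa architecture, suited to NLU and NLG tasks, with support for PyTorch, TensorFlow, UER, Hugging Face, and other frameworks. Multilingual and Chinese …☆29Nov 17, 2022Updated 3 years ago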
- Code for "Small Models are Valuable Plug-ins for Large Language Models"☆132May 16, 2023Updated 3 years ago
- [ACL 2023] Are Pre-trained Language Models Useful for Model Ensemble in Chinese Grammatical Error Correction?☆10Dec 15, 2025Updated 6 months ago
- Code for embedding and retrieval research.☆16Oct 24, 2023Updated 2 years ago
- Calculate the probability of a paper being accepted by EMNLP2023 based on score distribution of ACL2023.☆14Sep 7, 2023Updated 2 years ago
- RACE is a multi-dimensional benchmark for code generation that focuses on Readability, mAintainability, Correctness, and Efficiency.☆14Oct 12, 2024Updated last year
- Awesome papers on Language-Model-as-a-Service (LMaaS)☆545May 14, 2024Updated 2 years ago
- ☆16Nov 5, 2018Updated 7 years ago
- [ICLR'25 Spotlight] Min-K%++: Improved baseline for detecting pre-training data of LLMs☆57May 26, 2025Updated last year
- A curated list of awesome resources dedicated to Scaling Laws for LLMs☆84Apr 10, 2023Updated 3 years ago
- ☆14Jul 27, 2022Updated 3 years ago
- Soochow University Graduate Thesis TeX Template☆22Feb 27, 2026Updated 3 months ago
- ☆22Apr 14, 2020Updated 6 years ago
- Use the tokenizer in parallel to achieve superior acceleration☆20Mar 21, 2024Updated 2 years ago
- Source code of ACL 2023 Main Conference Paper "PAD-Net: An Efficient Framework for Dynamic Networks".☆14Feb 28, 2026Updated 3 months ago
- Code and data for the paper: On the Reliability of Psychological Scales on Large Language Models☆30Dec 15, 2025Updated 6 months ago