YangLinyi / GLUE-X
We use 14 datasets as out-of-distribution (OOD) test data and evaluate 21 widely used models on 8 NLU tasks. Our findings confirm that OOD accuracy in NLP tasks deserves more attention: compared with in-distribution (ID) accuracy, a significant performance decay appears in all settings.
☆93Updated last year
Alternatives and similar repositories for GLUE-X
Users that are interested in GLUE-X are comparing it to the libraries listed below
- This includes the original implementation of CtrlA: Adaptive Retrieval-Augmented Generation via Inherent Control.☆62Updated 9 months ago
- [EMNLP 2024] DA-Code: Agent Data Science Code Generation Benchmark for Large Language Models☆74Updated 3 weeks ago
- [ACL 2025 main] SCAR: Data Selection via Style Consistency-Aware Response Ranking for Efficient Instruction-Tuning of Large Language Mode…☆36Updated last week
- A Unified Intermediate Representation for Graph Query Languages☆68Updated 2 years ago
- ☆102Updated 2 years ago
- This tool(enhance_long) aims to enhance the LlaMa2 long context extrapolation capability in the lowest-cost approach, preferably without …☆45Updated last year
- Hybrid Latent Reasoning via Reinforcement Learning☆142Updated 2 months ago
- [COLING'22] Code for "SCL-RAI: Span-based Contrastive Learning with Retrieval Augmented Inference for Unlabeled Entity Problem in NER"☆46Updated 10 months ago
- MPLSandbox is an out-of-the-box multi-programming language sandbox designed to provide unified and comprehensive feedback from compiler a…☆178Updated 3 months ago
- An Extensible Framework for Retrieval-Augmented LLM Applications: Learning Relevance Beyond Simple Similarity.☆39Updated 7 months ago
- LLM Benchmark for Code☆30Updated last year
- A general AI agent framework that can be adapted to various tasks and environments.☆101Updated 6 months ago
- Search and Refine During Think: Autonomous Retrieval‑Augmented Reasoning of LLMs☆89Updated last month
- ☆48Updated 9 months ago
- Official Implementation of "Pay Attention to What You Need"☆43Updated 5 months ago
- This repo contains my custom-styled, Python-based plots for NLP papers, and includes my reproductions of plots from my favourite papers☆39Updated last year
- RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response☆42Updated 7 months ago
- ☆38Updated 3 weeks ago
- ☆45Updated last year
- Code of Journey to the Center of the Knowledge Neurons: Discoveries of Language-Independent Knowledge Neurons and Degenerate Knowledge Ne…☆26Updated last year
- SQL-o1: A Self-Reward Heuristic Dynamic Search Method for Text-to-SQL☆192Updated 2 months ago
- [ACL'23] Code for "SANTA: Separate Strategies for Inaccurate and Incomplete Annotation Noise in Distantly-Supervised Named Entity Recogni…☆40Updated 3 months ago
- A curated list of awesome papers related to adversarial attacks and defenses for information retrieval. If I missed any papers, feel free…☆218Updated last year
- A collection of papers related to knowledge fusion☆57Updated 9 months ago
- Your efficient and accurate answer verification system for RL training.☆34Updated last month
- Counterfactual-inference-based Text-classification Debiasing Framework.☆83Updated 4 years ago
- ☆100Updated 3 weeks ago
- [NeurIPS 2024] EffiBench: Benchmarking the Efficiency of Automatically Generated Code☆55Updated 8 months ago
- A library for generating difficulty-scalable, multi-tool, and verifiable agentic tasks with execution trajectories.☆141Updated 3 weeks ago
- Author: Wenhao Yu (wyu1@nd.edu). ACL 2022 Dict-BERT: Enhancing Language Model Pre-training with Dictionary☆41Updated 2 years ago