Di-viner/LLM-Robustness-to-Irrelevant-Information

Di-viner / LLM-Robustness-to-Irrelevant-Information

[COLM'24] How Easily do Irrelevant Inputs Skew the Responses of Large Language Models?

☆23

Alternatives and similar repositories for LLM-Robustness-to-Irrelevant-Information

Users who are interested in LLM-Robustness-to-Irrelevant-Information are comparing it to the repositories listed below.

TEAM-ARM / arm
[NeurIPS'25 Spotlight] ARM: Adaptive Reasoning Model
☆68 · Apr 6, 2026 · Updated 3 months ago
language-agent-tutorial / language-agent-tutorial.github.io
[EMNLP 2024 Tutorial] Language Agents: Foundations, Prospects, and Risks
☆10 · Nov 27, 2024 · Updated last year
gao-xiao-bai / JsonTuning
JsonTuning: Towards Generalizable, Robust, and Controllable Instruction Tuning
☆10 · Nov 3, 2024 · Updated last year
RenzeLou / AAAR-1.0
The source code for running LLMs on the AAAR-1.0 benchmark.
☆20 · Apr 5, 2025 · Updated last year
OSU-NLP-Group / LLM-Knowledge-Conflict
[ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts"
☆84 · Apr 12, 2024 · Updated 2 years ago
hexuandeng / Mono4SiMT
The implementation for our paper, "Improving Simultaneous Machine Translation with Monolingual Data," accepted to AAAI 2023. 🎉
☆12 · Jul 19, 2023 · Updated 3 years ago
OSU-NLP-Group / Deductive-Beam-Search
[COLM'24] "Deductive Beam Search: Decoding Deducible Rationale for Chain-of-Thought Reasoning"
☆21 · Jun 14, 2024 · Updated 2 years ago
NayMyatMin / CROW
Internal Consistency Regularization (CROW) for LLM Backdoor Elimination - Paper accepted to ICML 2025
☆16 · May 6, 2025 · Updated last year
SunbowLiu / SurfaceFusion
Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning (ICLR 2021)
☆24 · Mar 18, 2021 · Updated 5 years ago
m3yrin / aligned-cross-entropy
Test implementation of "Aligned Cross Entropy for Non-Autoregressive Machine Translation" https://arxiv.org/abs/2004.01655
☆21 · Jul 25, 2024 · Updated 2 years ago
OSU-NLP-Group / llm-planning-eval
[ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"
☆54 · Feb 23, 2024 · Updated 2 years ago
YujieLu10 / Seeker
☆11 · May 24, 2024 · Updated 2 years ago
MattYoon / reasoning-models-confidence
[NeurIPS 2025] Reasoning Models Better Express Their Confidence
☆23 · Nov 19, 2025 · Updated 8 months ago
allenai / sso
Repository for Skill Set Optimization
☆14 · Jul 26, 2024 · Updated 2 years ago
simonepri / fever-transformers
📄 Evidence Retrieval and Claim Verification for the FEVER shared task using Transformer Networks
☆12 · Feb 21, 2020 · Updated 6 years ago
Mosi-AI / M2RL
☆16 · May 15, 2026 · Updated 2 months ago
Chengsong-Huang / Self-Calibration
Code for Efficient Test-Time Scaling via Self-Calibration
☆20 · Sep 13, 2025 · Updated 10 months ago
qtli / GSM-Plus
GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems.
☆66 · Jul 8, 2024 · Updated 2 years ago
UM-Data-Intelligence-Lab / NYLON_code
☆20 · Feb 18, 2024 · Updated 2 years ago
UM-Data-Intelligence-Lab / HELIOS_code
☆20 · Oct 29, 2023 · Updated 2 years ago
siyuyuan / coscript
Resources for our ACL 2023 paper: Distilling Script Knowledge from Large Language Models for Constrained Language Planning
☆36 · Aug 19, 2023 · Updated 2 years ago
casetext / r-and-r
Code for the "Long Context Needs Some R&R" paper.
☆12 · Mar 11, 2024 · Updated 2 years ago
OSU-NLP-Group / QUEST
"QUEST: Training Frontier Deep Research Agents with Fully Synthetic Tasks"
☆238 · Jul 22, 2026 · Updated last week
OSU-NLP-Group / Mind2Web-2
[NeurIPS'25 D&B] Mind2Web-2 Benchmark: Evaluating Agentic Search with Agent-as-a-Judge
☆112 · May 17, 2026 · Updated 2 months ago
eXascaleInfolab / HistoSketch
Implementation of HistoSketch and D2HistoSketch in MATLAB
☆19 · Aug 29, 2018 · Updated 7 years ago
Shawn-Guo-CN / Lossless_Text_Compression_with_Transformer
This repo demonstrates the concept of lossless compression with Transformers as encoder and decoder.
☆14 · May 2, 2024 · Updated 2 years ago
hsajjad / ConceptX
Analyzing Latent Concept in Pre-trained Transformer Models
☆12 · Jul 18, 2022 · Updated 4 years ago
PathMMU-Benchmark / PathMMU
☆39 · Dec 11, 2024 · Updated last year
crushr / EANN_Implemetation
EANN (PyTorch)
☆10 · Mar 12, 2022 · Updated 4 years ago
TIGER-AI-Lab / GenAI-Arena
Interface for GenAI-Arena [NeurIPS24]
☆16 · Feb 27, 2024 · Updated 2 years ago
rhyang2021 / ARIA
Source code for our paper: "ARIA: Training Language Agents with Intention-Driven Reward Aggregation".
☆30 · Aug 9, 2025 · Updated 11 months ago
OSU-NLP-Group / AttrScore
Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models"
☆56 · Jul 3, 2023 · Updated 3 years ago
Pursue1221 / FlashbackPlusPlus
☆24 · Mar 12, 2024 · Updated 2 years ago
dqwang122 / MLROUGE
ROUGE for multilingual Summarization
☆25 · Oct 11, 2021 · Updated 4 years ago
jiangjiechen / HedModTmplGen
Code for our ACL 2019 long paper: "Ensuring Readability and Data-fidelity using Head-modifier Templates in Deep Type Description Generati…
☆11 · Nov 5, 2022 · Updated 3 years ago
osome-iu / ChatGPT_domain_rating
Code and data for paper "Large language models can rate news outlet credibility"
☆13 · Aug 10, 2024 · Updated last year
HITsz-TMG / SKURG
☆20 · Nov 4, 2023 · Updated 2 years ago
allenai / faithful-nmn
Evaluating and improving the faithfulness of the interpretations offered by Neural Module Networks
☆13 · Jun 12, 2023 · Updated 3 years ago
Xt-cyh / CoDI-Eval
☆22 · May 7, 2025 · Updated last year