☆38Jan 17, 2025Updated last year
Alternatives and similar repositories for StanfordClashEval
Users that are interested in StanfordClashEval are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- ☆14Oct 17, 2024Updated last year
- [ICLR'24 Spotlight] "Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts"☆80Apr 12, 2024Updated last year
- Open sourced result for The Agent Company☆21Nov 11, 2025Updated 4 months ago
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"☆153Sep 21, 2024Updated last year
- [KDD24-ADS] R-Eval: A Unified Toolkit for Evaluating Domain Knowledge of Retrieval Augmented Large Language Models☆11Apr 9, 2024Updated last year
- Wordpress hosting with auto-scaling on Cloudways • AdFully Managed hosting built for WordPress-powered businesses that need reliable, auto-scalable hosting. Cloudways SafeUpdates now available.
- The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity".☆30Nov 12, 2024Updated last year
- Accompanying code for "Boosted Prompt Ensembles for Large Language Models"☆30Apr 13, 2023Updated 2 years ago
- Leaderboard of Frontier Models for Program Repair https://repairbench.github.io/☆11Oct 26, 2025Updated 5 months ago
- Text generation from structured data☆10Dec 2, 2019Updated 6 years ago
- ☆16May 14, 2025Updated 10 months ago
- (NeurIPS 2024) One-shot Federated Learning via Synthetic Distiller-Distillate Communication☆18Mar 11, 2025Updated last year
- Code accompanying "How I learned to start worrying about prompt formatting".☆118Jun 8, 2025Updated 9 months ago
- Radiology Language Evaluations☆11Nov 17, 2023Updated 2 years ago
- [ICLR 2025] BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval☆195Sep 13, 2025Updated 6 months ago
- Simple, predictable pricing with DigitalOcean hosting • AdAlways know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
- Danmuku dataset☆11Jul 7, 2023Updated 2 years ago
- [NeurIPS'24] UDA: A Benchmark Suite for Retrieval Augmented Generation in Real-world Document Analysis☆43Feb 21, 2025Updated last year
- ☆12Mar 7, 2024Updated 2 years ago
- Official Repository for the ICLR 2022 paper "Generalization of Neural Combinatorial Solvers through the Lens of Adversarial Robustness"☆13Nov 20, 2022Updated 3 years ago
- [EMNLP 2024 Findings] ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs☆29May 22, 2025Updated 10 months ago
- A Workbench for Autograding Retrieve/Generate Systems☆15Jun 30, 2025Updated 9 months ago
- Official code for "Evaluations of Machine Learning Privacy Defenses are Misleading" (https://arxiv.org/abs/2404.17399)☆12Apr 29, 2024Updated last year
- Offcial Repo of Paper "Eliminating Position Bias of Language Models: A Mechanistic Approach""☆21Jun 13, 2025Updated 9 months ago
- [ICLR-2026] Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning".☆32Feb 26, 2026Updated last month
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- Unofficial implementation of the Ask-LLM paper 'How to Train Data-Efficient LLMs', arXiv:2402.09668.☆12Jun 19, 2024Updated last year
- ☆14Oct 12, 2024Updated last year
- Test equality between a black-box LLM API and a reference distribution☆12Oct 29, 2024Updated last year
- Project for restoring beautiful K-pop Idols Images to high quality.☆14Mar 19, 2023Updated 3 years ago
- ☆24Feb 4, 2026Updated last month
- [ICLR 2026] Official Implementation of ProxyThinker: Test-Time Guidance through Small Visual Reasoners.☆21Sep 24, 2025Updated 6 months ago
- Fine tuning of the Retrieval-Augmented Generation (RAG) with a custom knowledge source.☆13Feb 10, 2021Updated 5 years ago
- A version of the SUSTAIN model of category learning (Love, Medin, & Gureckis, 2004) implemented in Python☆20Apr 20, 2016Updated 9 years ago
- (NeurIPS 2025 🔥) Official implementation for "Efficient Multi-modal Large Language Models via Progressive Consistency Distillation"☆46Feb 11, 2026Updated last month
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting with the flexibility to host WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Cloudways by DigitalOcean.
- ☆17Oct 2, 2024Updated last year
- Corresponding code to "Improving Robustness of ML Classifiers against Realizable Evasion Attacks Using Conserved Features" @ USENIX Secur…☆11Aug 5, 2019Updated 6 years ago
- ☆13Apr 18, 2024Updated last year
- ☆17Mar 17, 2020Updated 6 years ago
- PyTorch code for Improving Commonsense in Vision-Language Models via Knowledge Graph Riddles (DANCE)☆23Nov 29, 2022Updated 3 years ago
- Code for COLING 2020 paper "Improving Document-level Sentiment Analysis with User and Product Context"☆11Apr 13, 2022Updated 3 years ago
- SemEval2026 Task 3 DimABSA☆31Mar 13, 2026Updated 2 weeks ago