neelsjain / BYOD

The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models"

☆107

Alternatives and similar repositories for BYOD

Users who are interested in BYOD are comparing it to the repositories listed below.

ruiqi-zhong / D5
The GitHub repo for Goal Driven Discovery of Distributional Differences via Language Descriptions
☆70 Updated 2 years ago
QingruZhang / PASTA
PASTA: Post-hoc Attention Steering for LLMs
☆122 Updated 8 months ago
EleutherAI / semantic-memorization
☆44 Updated 8 months ago
bloomberg / dataless-model-merging
Code release for Dataless Knowledge Fusion by Merging Weights of Language Models (https://openreview.net/forum?id=FCnohuR6AnM)
☆89 Updated 2 years ago
LoryPack / LLM-LieDetector
Code for the ICLR 2024 paper "How to catch an AI liar: Lie detection in black-box LLMs by asking unrelated questions"
☆71 Updated last year
SALT-NLP / demonstrated-feedback
☆124 Updated 10 months ago
hadasah / btm
☆75 Updated last year
Edward-Sun / RECITE
Code for the ICLR paper: https://openreview.net/forum?id=-cqvvvb-NkI
☆94 Updated 2 years ago
kernelmachine / silo-lm
SILO Language Models code repository
☆81 Updated last year
cambridgeltl / PairS
Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators (Liu et al.; COLM 2024)
☆47 Updated 6 months ago
ahans30 / goldfish-loss
[NeurIPS 2024] Goldfish Loss: Mitigating Memorization in Generative LLMs
☆91 Updated 8 months ago
r-three / phatgoose
Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization"
☆86 Updated last year
seonghyeonye / TAPP
[AAAI 2024] Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following
☆79 Updated 10 months ago
guy-dar / embedding-space
☆54 Updated 2 years ago
evandez / REMEDI
Inspecting and Editing Knowledge Representations in Language Models
☆116 Updated 2 years ago
XiangLi1999 / AutoBencher
☆29 Updated last year
veronica320 / Faithful-COT
Code and data accompanying our paper on arXiv "Faithful Chain-of-Thought Reasoning".
☆162 Updated last year
chaitanyamalaviya / ExpertQA
[Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers
☆131 Updated last year
kernelmachine / cbtm
Code repository for the c-BTM paper
☆107 Updated last year
OSU-NLP-Group / AttrScore
Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models"
☆56 Updated 2 years ago
neulab / gemini-benchmark
☆149 Updated last year
KaiNylund / lm-weights-encode-time
☆69 Updated 11 months ago
IBM / SALMON
Self-Alignment with Principle-Following Reward Models
☆162 Updated 2 months ago
yidingjiang / ado
The repository contains code for Adaptive Data Optimization
☆25 Updated 7 months ago
HazyResearch / TART
TART: A plug-and-play Transformer module for task-agnostic reasoning
☆200 Updated 2 years ago
casmlab / NPHardEval
Repository for NPHardEval, a quantified-dynamic benchmark of LLMs
☆57 Updated last year
salesforce / factualNLG
Code for the arXiv paper: "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond"
☆59 Updated 6 months ago
google / sycophancy-intervention
Scripts for generating synthetic finetuning data for reducing sycophancy.
☆113 Updated last year
facebookresearch / Shepherd
This is the repo for the paper Shepherd -- A Critic for Language Model Generation
☆219 Updated last year
seonghyeonye / Flipped-Learning
[ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
☆116 Updated last month