lilakk / BLEUBERI

Official repository for "BLEUBERI: BLEU is a surprisingly effective reward for instruction following"

☆23

Alternatives and similar repositories for BLEUBERI

Users who are interested in BLEUBERI are comparing it to the repositories listed below.

ContextualAI / CLAIR_and_APO
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
☆57 Updated 9 months ago
bespokelabsai / verifiers
Verifiers for LLM Reinforcement Learning
☆61 Updated 2 months ago
Zyphra / Zyda_processing
☆35 Updated last year
arcee-ai / DAM
☆51 Updated 7 months ago
ElleLeonne / Lightning-ReLoRA
A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite.
☆33 Updated last year
HazyResearch / aioli
Aioli: A unified optimization framework for language model data mixing
☆27 Updated 5 months ago
choosewhatulike / case2code
☆15 Updated 2 months ago
AnswerDotAI / ModernBERT-Instruct-mini-cookbook
☆47 Updated 4 months ago
Tufalabs / TextbooksToRL
☆20 Updated 3 months ago
krypticmouse / matryoshka-representation-learning
PyTorch implementation for MRL
☆18 Updated last year
austrian-code-wizard / c3po
☆27 Updated this week
ZeroSumEval / ZeroSumEval
A framework for pitting LLMs against each other in an evolving library of games ⚔
☆32 Updated 2 months ago
argilla-io / distilabel-spin-dibt
Repository containing the SPIN experiments on the DIBT 10k ranked prompts
☆24 Updated last year
SeunghyunSEO / optimized_hf_llama_class_for_training
☆47 Updated 10 months ago
minosvasilias / simple_grpo
Simple GRPO scripts and configurations.
☆58 Updated 4 months ago
Knowledgator / FlashDeBERTa
Truly flash implementation of DeBERTa disentangled attention mechanism.
☆59 Updated last month
allenai / infinigram-api
☆61 Updated 3 weeks ago
NathanGodey / qfilters
Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812)
☆33 Updated 3 months ago
likenneth / q_probe
Q-Probe: A Lightweight Approach to Reward Maximization for Language Models
☆41 Updated last year
para-lost / ReBase
ReBase: Training Task Experts through Retrieval Based Distillation
☆29 Updated 4 months ago
JHU-CLSP / RATIONALYST
Code for RATIONALYST: Pre-training Process-Supervision for Improving Reasoning https://arxiv.org/pdf/2410.01044
☆33 Updated 8 months ago
dinobby / MAgICoRE
☆24 Updated 9 months ago
du-nlp-lab / MLR-Copilot
☆65 Updated 2 months ago
john-hewitt / implicit-ins
Codebase for Instruction Following without Instruction Tuning
☆34 Updated 9 months ago
kyegomez / Infini-attention
Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO…
☆55 Updated this week
Tomorrowdawn / top_nsigma
The official code repo and data hub of top_nsigma sampling strategy for LLMs.
☆26 Updated 4 months ago
facebookresearch / lss_eval
This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he…
☆31 Updated last year
axolotl-ai-cloud / grpo_code
A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback.
☆30 Updated 2 months ago
penfever / wildchat-50m
Code, results and other artifacts from the paper introducing the WildChat-50m dataset and the Re-Wild model family.
☆29 Updated 2 months ago
ctlllll / understanding_llm_benchmarks
Understanding the correlation between different LLM benchmarks
☆29 Updated last year