vivien000 / regex-constrained-decoding
Fast, High-Fidelity LLM Decoding with Regex Constraints
☆20 Updated 11 months ago
Alternatives and similar repositories for regex-constrained-decoding
Users who are interested in regex-constrained-decoding are comparing it to the libraries listed below.
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆57 Updated 9 months ago
- Project code for training LLMs to write better unit tests + code ☆20 Updated last month
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆53 Updated 4 months ago
- NanoGPT-speedrunning for the poor T4 enjoyers ☆66 Updated 2 months ago
- ☆35 Updated last year
- Simple repository for training small reasoning models ☆33 Updated 4 months ago
- ☆61 Updated last week
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆24 Updated last year
- Official repository for "BLEUBERI: BLEU is a surprisingly effective reward for instruction following" ☆23 Updated 3 weeks ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔ ☆32 Updated 2 months ago
- ☆30 Updated 7 months ago
- ☆47 Updated 9 months ago
- Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers. ☆17 Updated 3 months ago
- Using FlexAttention to compute attention with different masking patterns ☆44 Updated 9 months ago
- The official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508) ☆35 Updated this week
- Simple GRPO scripts and configurations. ☆58 Updated 4 months ago
- rl from zero pretrain, can it be done? we'll see. ☆56 Updated this week
- ☆51 Updated 7 months ago
- Lego for GRPO ☆28 Updated last month
- An introduction to LLM Sampling ☆78 Updated 6 months ago
- Latent Large Language Models ☆18 Updated 10 months ago
- ☆47 Updated 4 months ago
- AGI API for the GR platform ☆18 Updated last month
- ☆61 Updated 3 weeks ago
- A repository for research on medium sized language models. ☆76 Updated last year
- Codebase accompanying the Summary of a Haystack paper. ☆78 Updated 9 months ago
- ☆49 Updated last year
- ☆56 Updated last month
- Truly flash implementation of DeBERTa disentangled attention mechanism. ☆58 Updated last month
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite. ☆33 Updated last year