unslothai / cut-cross-entropy

Apple's Cut Cross Entropy

☆21

Alternatives and similar repositories for cut-cross-entropy

Users who are interested in cut-cross-entropy are comparing it to the repositories listed below.

leloykun / modded-nanogpt
NanoGPT (124M) quality in 2.67B tokens
☆28 Updated last month
mistralai / mistral-evals
☆77 Updated 2 months ago
tyler-romero / microR1
Simple repository for training small reasoning models
☆40 Updated 8 months ago
allenai / olmo-cookbook
OLMost every training recipe you need to perform data interventions with the OLMo family of models.
☆50 Updated last week
fangyuan-ksgk / Tiny-GRPO
minimal GRPO implementation from scratch
☆98 Updated 7 months ago
foundation-model-stack / bamba
Train, tune, and infer Bamba model
☆134 Updated 4 months ago
TRI-ML / linear_open_lm
A repository for research on medium sized language models.
☆78 Updated last year
Zyphra / Zamba2
PyTorch implementation of models from the Zamba2 series.
☆185 Updated 8 months ago
Danau5tin / calculator_agent_rl
Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a 62% absolute increase in evaluation accuracy.
☆56 Updated 5 months ago
Zyphra / Zyda_processing
☆39 Updated last year
joey00072 / ohara
Collection of autoregressive model implementations
☆86 Updated 5 months ago
arcee-ai / DAM
☆55 Updated 11 months ago
bespokelabsai / verifiers
Verifiers for LLM Reinforcement Learning
☆76 Updated 6 months ago
ContextualAI / CLAIR_and_APO
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
☆60 Updated last year
gkamradt / SnakeBench
☆93 Updated 4 months ago
akjindal53244 / Arithmo
Small and Efficient Mathematical Reasoning LLMs
☆72 Updated last year
Edward-Sun / gpt-accelera
Simple and efficient pytorch-native transformer training and inference (batched)
☆78 Updated last year
mlfoundations / scaling
Language models scale reliably with over-training and on downstream tasks
☆100 Updated last year
casper-hansen / OpenCoconut
OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.
☆172 Updated 9 months ago
letta-ai / sleep-time-compute
accompanying material for sleep-time compute paper
☆117 Updated 5 months ago
flowersteam / SOAR
Implementation of SOAR
☆42 Updated last month
goncalorafaria / qalign
QAlign is a new test-time alignment approach that improves language model performance by using Markov chain Monte Carlo methods.
☆24 Updated last month
OpenPipe / deductive-reasoning
Train your own SOTA deductive reasoning model
☆108 Updated 7 months ago
AIMO-CMU-MATH / CMU_MATH-AIMO
☆76 Updated last year
facebookresearch / memory
Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars…
☆342 Updated 10 months ago
lucidrains / infini-transformer-pytorch
Implementation of Infini-Transformer in Pytorch
☆113 Updated 9 months ago
rsshyam / GRPO
☆67 Updated last year
ScalingIntelligence / large_language_monkeys
☆107 Updated last year
golololologol / LLM-Distillery
A pipeline for LLM knowledge distillation
☆109 Updated 6 months ago
devvrit / matformer
MatFormer repo
☆63 Updated 10 months ago