YuchenJin / llm.c

LLM training in simple, raw C/CUDA

☆12

Related projects

Alternatives and complementary repositories for llm.c

drisspg / transformer_nuggets
A place to store reusable transformer components of my own creation or found on the interwebs
☆43 Updated this week
sher222 / LeReT
Learning to Retrieve by Trying - Source code for Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval
☆24 Updated last week
arcee-ai / DAM
☆38 Updated this week
frankxwang / dpo-prefix-sharing
DPO, but faster 🚀
☆20 Updated 2 weeks ago
xjdr-alt / muzero_sketch
☆36 Updated 3 months ago
google-research-datasets / QAmeleon
QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P…
☆34 Updated last year
krypticmouse / matryoshka-representation-learning
PyTorch implementation for MRL
☆18 Updated 8 months ago
Aleph-Alpha / trigrams
☆44 Updated 2 months ago
CarperAI / treasure_trove
☆22 Updated last year
google-deepmind / asyncdiloco
☆39 Updated 9 months ago
amudide / switch_sae
Efficient Dictionary Learning with Switch Sparse Autoencoders (SAEs)
☆13 Updated last month
Zyphra / Zyda_processing
☆26 Updated 4 months ago
joey00072 / ohara
Collection of autoregressive model implementations
☆66 Updated last week
ContextualAI / CLAIR_and_APO
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
☆46 Updated 2 months ago
RobertCsordas / moeut
☆61 Updated 2 months ago
SeunghyunSEO / optimized_hf_llama_class_for_training
☆44 Updated 2 months ago
TRI-ML / linear_open_lm
A repository for research on medium-sized language models.
☆74 Updated 5 months ago
joey00072 / microjax
A JAX-like function transformation engine, but micro: microjax
☆26 Updated 2 weeks ago
andrew-silva / clean-rl-mlx
Clean RL implementation using MLX
☆26 Updated 8 months ago
pchizhov / picky_bpe
A BPE modification that removes intermediate tokens during tokenizer training.
☆22 Updated 2 months ago
KhoomeiK / complexity-scaling
gzip Predicts Data-dependent Scaling Laws
☆32 Updated 5 months ago
para-lost / ReBase
ReBase: Training Task Experts through Retrieval Based Distillation
☆27 Updated 3 months ago
modal-labs / ci-on-modal
A sample pattern for running CI tests on Modal
☆13 Updated last month
RobertCsordas / moe
Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers"
☆36 Updated 11 months ago
jxmorris12 / cde
Code for training and evaluating Contextual Document Embedding models
☆93 Updated this week
yidingjiang / ado
This repository contains code for Adaptive Data Optimization
☆18 Updated 3 weeks ago
ElleLeonne / Lightning-ReLoRA
A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite.
☆33 Updated 8 months ago
haileyschoelkopf / triton-index
See https://github.com/cuda-mode/triton-index/ instead!
☆11 Updated 6 months ago
AnswerDotAI / toolslm
Tools to make language models a bit easier to use
☆30 Updated 2 weeks ago
apple / ml-hypercloning
☆34 Updated last week