thomasahle / cce
Clustered Compositional Embeddings
☆11 · Updated last year
Alternatives and similar repositories for cce
Users who are interested in cce are comparing it to the repositories listed below.
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX ☆89 · Updated last year
- A MAD laboratory to improve AI architecture designs 🧪 ☆129 · Updated 9 months ago
- ☆54 · Updated 11 months ago
- ☆46 · Updated last year
- nanoGPT-like codebase for LLM training ☆107 · Updated 4 months ago
- PyTorch implementation for MRL ☆19 · Updated last year
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆59 · Updated last year
- ☆82 · Updated last year
- ☆58 · Updated last year
- Minimum Description Length probing for neural network representations ☆20 · Updated 8 months ago
- ☆53 · Updated last year
- ☆33 · Updated last year
- Evaluation of neuro-symbolic engines ☆39 · Updated last year
- some common Huggingface transformers in maximal update parametrization (µP) ☆82 · Updated 3 years ago
- ☆35 · Updated 10 months ago
- Triton Implementation of HyperAttention Algorithm ☆48 · Updated last year
- Train a SmolLM-style llm on fineweb-edu in JAX/Flax with an assortment of optimizers. ☆19 · Updated 2 months ago
- Code for "Accelerating Training with Neuron Interaction and Nowcasting Networks" [to appear at ICLR 2025] ☆20 · Updated this week
- Aioli: A unified optimization framework for language model data mixing ☆27 · Updated 8 months ago
- gzip Predicts Data-dependent Scaling Laws ☆34 · Updated last year
- ☆13 · Updated 4 months ago
- ☆53 · Updated last year
- Simple repository for training small reasoning models ☆40 · Updated 8 months ago
- ☆13 · Updated 7 months ago
- ☆142 · Updated 3 weeks ago
- Simple GRPO scripts and configurations. ☆59 · Updated 8 months ago
- Using FlexAttention to compute attention with different masking patterns ☆44 · Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆164 · Updated 3 months ago
- An annotated implementation of the Hyena Hierarchy paper ☆34 · Updated 2 years ago
- LLM training in simple, raw C/CUDA ☆15 · Updated 10 months ago