attentionmech / dex
Pokedex for LLMs
☆13Updated 2 months ago
Alternatives and similar repositories for dex
Users that are interested in dex are comparing it to the libraries listed below
- Code for the paper "Function-Space Learning Rates"☆20Updated 3 weeks ago
- Tiny evaluation of leading LLMs on competitive programming problems☆14Updated 7 months ago
- Alternative way to calculate self-attention☆18Updated last year
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's Pytorch Lightning suite.☆33Updated last year
- ☆10Updated 2 months ago
- ☆21Updated 3 months ago
- Implementation of a holodeck, written in Pytorch☆18Updated last year
- ☆22Updated last month
- Jax like function transformation engine but micro, microjax☆32Updated 8 months ago
- Describe the format of image/text datasets☆11Updated 3 years ago
- ☆21Updated last month
- LLM attention pattern visualizer☆10Updated last year
- A sample pattern for running CI tests on Modal☆18Updated 2 months ago
- Minimum Description Length probing for neural network representations☆18Updated 5 months ago
- ☆23Updated last month
- ☆20Updated last year
- Transformer with Mu-Parameterization, implemented in Jax/Flax. Supports FSDP on TPU pods.☆30Updated 3 weeks ago
- ☆38Updated 11 months ago
- Utilities for PyTorch distributed☆24Updated 4 months ago
- Train a SmolLM-style llm on fineweb-edu in JAX/Flax with an assortment of optimizers.☆17Updated 3 months ago
- BH hackathon☆14Updated last year
- Pixel Parsing. A reproduction of OCR-free end-to-end document understanding models with open data☆21Updated 10 months ago
- aesthetic tensor visualiser☆24Updated 2 months ago
- Implementation of Gradient Agreement Filtering, from Chaubard et al. of Stanford, but for single machine microbatches, in Pytorch☆25Updated 5 months ago
- ☆33Updated 5 months ago
- rl from zero pretrain, can it be done? we'll see.☆56Updated this week
- ☆23Updated 6 months ago
- ☆13Updated last month
- Project code for training LLMs to write better unit tests + code☆20Updated last month
- LLM training in simple, raw C/CUDA☆14Updated 6 months ago