tensoic / Cerule
Cerule - A Tiny Mighty Vision Model
☆67 · Updated 7 months ago
Alternatives and similar repositories for Cerule:
Users who are interested in Cerule are comparing it to the libraries listed below.
- an open-source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆93 · Updated last month
- Collection of autoregressive model implementations ☆85 · Updated last month
- look how they massacred my boy ☆63 · Updated 5 months ago
- Video+code lecture on building nanoGPT from scratch ☆66 · Updated 9 months ago
- Full finetuning of large language models without large memory requirements ☆93 · Updated last year
- ☆63 · Updated 6 months ago
- ☆49 · Updated last year
- ☆112 · Updated 3 months ago
- Set of scripts to finetune LLMs ☆37 · Updated last year
- ☆38 · Updated 8 months ago
- ☆15 · Updated last year
- ☆50 · Updated last year
- ☆129 · Updated 7 months ago
- Focused on fast experimentation and simplicity ☆71 · Updated 3 months ago
- ☆66 · Updated 10 months ago
- [WIP] Transformer to embed Danbooru labelsets ☆13 · Updated last year
- Simple GRPO scripts and configurations. ☆58 · Updated 2 months ago
- entropix style sampling + GUI ☆25 · Updated 5 months ago
- realtime latent world model inference demo ☆44 · Updated 5 months ago
- Fast approximate inference on a single GPU with sparsity aware offloading ☆38 · Updated last year
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆39 · Updated 2 months ago
- an implementation of Self-Extend, to expand the context window via grouped attention ☆119 · Updated last year
- Synthetic data derived by templating, few-shot prompting, transformations on public domain corpora, and Monte Carlo tree search. ☆31 · Updated last month
- Lego for GRPO ☆26 · Updated last week
- ☆48 · Updated last year
- An introduction to LLM Sampling ☆77 · Updated 3 months ago
- inference code for mixtral-8x7b-32kseqlen ☆99 · Updated last year
- All the world is a play, we are but actors in it. ☆49 · Updated this week
- ☆97 · Updated 6 months ago
- Lightweight package that tracks and summarizes code changes using LLMs (Large Language Models) ☆32 · Updated last month