evintunador / FractalFormer
A GPT with self-similar nested properties
☆20 Updated last year
Alternatives and similar repositories for FractalFormer
Users that are interested in FractalFormer are comparing it to the libraries listed below
- look how they massacred my boy ☆63 Updated 8 months ago
- Cerule - A Tiny Mighty Vision Model ☆66 Updated 9 months ago
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs. ☆41 Updated last year
- GPTs inside of GPTs like Russian nesting dolls ☆9 Updated last year
- entropix style sampling + GUI ☆26 Updated 7 months ago
- Video+code lecture on building nanoGPT from scratch ☆68 Updated last year
- ☆66 Updated last year
- ☆115 Updated 6 months ago
- GPT-2 small trained on phi-like data ☆66 Updated last year
- ☆133 Updated 10 months ago
- 5X faster 60% less memory QLoRA finetuning ☆21 Updated last year
- An unsupervised model merging algorithm for Transformers-based language models. ☆105 Updated last year
- ☆27 Updated last year
- MLX port for xjdr's entropix sampler (mimics jax implementation) ☆64 Updated 7 months ago
- Modify Entropy Based Sampling to work with Mac Silicon via MLX ☆50 Updated 7 months ago
- The open-source implementation of Q*, achieved in context as a zero-shot reprogramming of the attention mechanism. (synthetic data) ☆1 Updated 6 months ago
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆31 Updated last year
- Let's create synthetic textbooks together :) ☆75 Updated last year
- Using multiple LLMs for ensemble Forecasting ☆16 Updated last year
- Full finetuning of large language models without large memory requirements ☆94 Updated last year
- Collection of autoregressive model implementation ☆85 Updated 2 months ago
- Modeling code for a BitNet b1.58 Llama-style model. ☆25 Updated last year
- All the world is a play, we are but actors in it. ☆50 Updated this week
- The next evolution of Agents ☆48 Updated this week
- ☆20 Updated last year
- ☆34 Updated 3 months ago
- 1.58-bit LLaMa model ☆81 Updated last year
- Lego for GRPO ☆28 Updated last month
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆53 Updated 4 months ago
- RAG Agent for the ARC AGI Challenge ☆21 Updated 11 months ago