CalculatedContent / setol_paper
SETOL: SemiEmpirical Theory of (Deep) Learning
☆19 Updated this week
Alternatives and similar repositories for setol_paper
Users who are interested in setol_paper are comparing it to the libraries listed below.
- ☆134 Updated 2 months ago
- 🧱 Modula software package ☆200 Updated 2 months ago
- Official JAX implementation of xLSTM including fast and efficient training and inference code. 7B model available at https://huggingface.… ☆94 Updated 5 months ago
- An introduction to LLM Sampling ☆78 Updated 6 months ago
- Getting crystal-like representations with harmonic loss ☆190 Updated 2 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆139 Updated this week
- Deep Learning, an Energy Approach ☆139 Updated 2 weeks ago
- Efficient optimizers ☆220 Updated last week
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆140 Updated last month
- supporting pytorch FSDP for optimizers ☆82 Updated 6 months ago
- rl from zero pretrain, can it be done? we'll see. ☆56 Updated this week
- ☆190 Updated 6 months ago
- A package for defining deep learning models using categorical algebraic expressions. ☆61 Updated 11 months ago
- Genetics for Language Models ☆13 Updated 11 months ago
- Latent Program Network (from the "Searching Latent Program Spaces" paper) ☆87 Updated 3 months ago
- NanoGPT-speedrunning for the poor T4 enjoyers ☆66 Updated 2 months ago
- ☆114 Updated 6 months ago
- ☆220 Updated 3 weeks ago
- ☆65 Updated last year
- Open source interpretability artefacts for R1. ☆149 Updated 2 months ago
- Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research). ☆202 Updated 6 months ago
- PyTorch library for Active Fine-Tuning ☆80 Updated 4 months ago
- nanoGPT-like codebase for LLM training ☆98 Updated last month
- A simple MLX implementation for pretraining LLMs on Apple Silicon. ☆80 Updated last month
- The history files when recording human interaction while solving ARC tasks ☆112 Updated 2 weeks ago
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr… ☆63 Updated 7 months ago
- ☆150 Updated 10 months ago
- ☆98 Updated 5 months ago
- An implementation of PSGD Kron second-order optimizer for PyTorch ☆91 Updated 2 months ago
- Attribution-based Parameter Decomposition ☆25 Updated 2 weeks ago