CalculatedContent / setol_paper

SETOL: SemiEmpirical Theory of (Deep) Learning

☆19

Alternatives and similar repositories for setol_paper

Users who are interested in setol_paper are comparing it to the repositories listed below.

google-deepmind / mishax
☆134 · Updated 2 months ago
modula-systems / modula
🧱 Modula software package
☆200 · Updated 2 months ago
NX-AI / xlstm-jax
Official JAX implementation of xLSTM including fast and efficient training and inference code. 7B model available at https://huggingface.…
☆94 · Updated 5 months ago
Pleias / Quest-Best-Tokens
An introduction to LLM Sampling
☆78 · Updated 6 months ago
KindXiaoming / grow-crystals
Getting crystal-like representations with harmonic loss
☆190 · Updated 2 months ago
EleutherAI / nanoGPT-mup
The simplest, fastest repository for training/finetuning medium-sized GPTs.
☆139 · Updated this week
Atcold / Energy-Book
Deep Learning, an Energy Approach
☆139 · Updated 2 weeks ago
HomebrewML / HeavyBall
Efficient optimizers
☆220 · Updated last week
apoorvkh / academic-pretraining
$100K or 100 Days: Trade-offs when Pre-Training with Academic Resources
☆140 · Updated last month
ethansmith2000 / fsdp_optimizers
supporting pytorch FSDP for optimizers
☆82 · Updated 6 months ago
tokenbender / avataRL
rl from zero pretrain, can it be done? we'll see.
☆56 · Updated this week
nikhilvyas / SOAP
☆190 · Updated 6 months ago
vtabbott / Algebraic-NCD
A package for defining deep learning models using categorical algebraic expressions.
☆61 · Updated 11 months ago
Nicolas-Yax / PhyloLM
Genetics for Language Models
☆13 · Updated 11 months ago
clement-bonnet / lpn
Latent Program Network (from the "Searching Latent Program Spaces" paper)
☆87 · Updated 3 months ago
VatsaDev / NanoPoor
NanoGPT-speedrunning for the poor T4 enjoyers
☆66 · Updated 2 months ago
jerber / lang-jepa
☆114 · Updated 6 months ago
google-research / optformer
☆220 · Updated 3 weeks ago
eemlcommunity / PracticalSessions2023
☆65 · Updated last year
goodfire-ai / r1-interpretability
Open source interpretability artefacts for R1.
☆149 · Updated 2 months ago
callummcdougall / sae_vis
Create feature-centric and prompt-centric visualizations for sparse autoencoders (like those from Anthropic's published research).
☆202 · Updated 6 months ago
jonhue / activeft
PyTorch library for Active Fine-Tuning
☆80 · Updated 4 months ago
epfml / llm-baselines
nanoGPT-like codebase for LLM training
☆98 · Updated last month
N8python / mlx-pretrain
A simple MLX implementation for pretraining LLMs on Apple Silicon.
☆80 · Updated last month
neoneye / ARC-Interactive-History-Dataset
The history files when recording human interaction while solving ARC tasks
☆112 · Updated 2 weeks ago
Aleph-Alpha-Research / scaling
Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr…
☆63 · Updated 7 months ago
BlackHC / neural_net_checklist
☆150 · Updated 10 months ago
LucasPrietoAl / grokking-at-the-edge-of-numerical-stability
☆98 · Updated 5 months ago
evanatyourservice / kron_torch
An implementation of PSGD Kron second-order optimizer for PyTorch
☆91 · Updated 2 months ago
ApolloResearch / apd
Attribution-based Parameter Decomposition
☆25 · Updated 2 weeks ago