fkodom / python-repo-template
Template repo for Python projects, especially those focusing on machine learning and/or deep learning.
☆15 · Updated 2 months ago
Alternatives and similar repositories for python-repo-template
Users who are interested in python-repo-template are comparing it to the libraries listed below.
- A MAD laboratory to improve AI architecture designs 🧪 ☆123 · Updated 7 months ago
- Implementation of the Llama architecture with RLHF + Q-learning ☆165 · Updated 5 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆149 · Updated 3 weeks ago
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ☆85 · Updated last year
- Understand and test language model architectures on synthetic tasks. ☆220 · Updated last week
- Annotated version of the Mamba paper ☆487 · Updated last year
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day ☆256 · Updated last year
- Prune transformer layers ☆69 · Updated last year
- nanoGPT-like codebase for LLM training ☆101 · Updated 2 months ago
- Collection of autoregressive model implementation ☆86 · Updated 3 months ago
- Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff" ☆238 · Updated last month
- Load compute kernels from the Hub ☆210 · Updated this week
- ☆162 · Updated last year
- Deep learning library implemented from scratch in numpy. Mixtral, Mamba, LLaMA, GPT, ResNet, and other experiments. ☆50 · Updated last year
- Comprehensive analysis of difference in performance of QLora, Lora, and Full Finetunes. ☆82 · Updated last year
- Implementation of Infini-Transformer in Pytorch ☆111 · Updated 6 months ago
- ☆166 · Updated 2 years ago
- FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores ☆323 · Updated 6 months ago
- An extension of the nanoGPT repository for training small MOE models. ☆163 · Updated 4 months ago
- Experiments around a simple idea for inducing multiple hierarchical predictive model within a GPT ☆215 · Updated 11 months ago
- ☆81 · Updated last year
- The AdEMAMix Optimizer: Better, Faster, Older. ☆183 · Updated 10 months ago
- gzip Predicts Data-dependent Scaling Laws ☆35 · Updated last year
- ☆61 · Updated last year
- Code repository for Black Mamba ☆250 · Updated last year
- ☆113 · Updated last year
- A really tiny autograd engine ☆94 · Updated last month
- Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines ☆197 · Updated last year
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch ☆89 · Updated last year
- A repository for log-time feedforward networks ☆222 · Updated last year