fkodom / python-repo-template
Template repo for Python projects, especially those focusing on machine learning and/or deep learning.
☆15 Updated 3 weeks ago
Alternatives and similar repositories for python-repo-template
Users that are interested in python-repo-template are comparing it to the libraries listed below
- A MAD laboratory to improve AI architecture designs 🧪 ☆137 Updated last year
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day ☆260 Updated 2 years ago
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ☆86 Updated 2 years ago
- Implementation of the Llama architecture with RLHF + Q-learning ☆170 Updated last year
- Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff" ☆247 Updated 8 months ago
- Understand and test language model architectures on synthetic tasks. ☆252 Updated 3 weeks ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆186 Updated 2 weeks ago
- Annotated version of the Mamba paper ☆495 Updated last year
- nanoGPT-like codebase for LLM training ☆113 Updated 3 months ago
- ☆167 Updated 2 years ago
- FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores ☆340 Updated last year
- Deep learning library implemented from scratch in numpy. Mixtral, Mamba, LLaMA, GPT, ResNet, and other experiments. ☆53 Updated last year
- some common Huggingface transformers in maximal update parametrization (µP) ☆87 Updated 3 years ago
- ☆62 Updated 2 years ago
- ☆94 Updated 2 years ago
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch ☆91 Updated 2 years ago
- ☆208 Updated 3 weeks ago
- A set of Python scripts that makes your experience on TPU better ☆56 Updated 4 months ago
- A repository for log-time feedforward networks ☆224 Updated last year
- JAX implementation of the Llama 2 model ☆216 Updated 2 years ago
- ☆92 Updated last year
- Experiments around a simple idea for inducing multiple hierarchical predictive model within a GPT ☆224 Updated last year
- This code repository contains the code used for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog po… ☆92 Updated 2 years ago
- Collection of autoregressive model implementation ☆85 Updated 3 weeks ago
- Implementation of Infini-Transformer in Pytorch ☆112 Updated last year
- Evaluating the Mamba architecture on the Othello game ☆49 Updated last year
- Implementation of the conditionally routed attention in the CoLT5 architecture, in Pytorch ☆231 Updated last year
- Implementation of ST-Moe, the latest incarnation of MoE after years of research at Brain, in Pytorch ☆378 Updated last year
- gzip Predicts Data-dependent Scaling Laws ☆34 Updated last year
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024) ☆198 Updated last year