fkodom / python-repo-template
Template repo for Python projects, especially those focusing on machine learning and/or deep learning.
☆12 · Updated 5 months ago
Alternatives and similar repositories for python-repo-template:
Users interested in python-repo-template are comparing it to the libraries listed below.
- A MAD laboratory to improve AI architecture designs 🧪 ☆102 · Updated 2 months ago
- Understand and test language model architectures on synthetic tasks. ☆181 · Updated last month
- Comprehensive analysis of the performance differences between QLoRA, LoRA, and full fine-tunes. ☆82 · Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆95 · Updated 3 months ago
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ☆81 · Updated last year
- Implementation of the Llama architecture with RLHF + Q-learning ☆162 · Updated 2 weeks ago
- Multipack distributed sampler for fast padding-free training of LLMs ☆184 · Updated 6 months ago
- ☆60 · Updated last year
- ☆164 · Updated last year
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch ☆153 · Updated last month
- Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff" ☆220 · Updated 2 months ago
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆115 · Updated 2 months ago
- Collection of autoregressive model implementations ☆81 · Updated this week
- Prune transformer layers ☆67 · Updated 8 months ago
- The simplest implementation of recent Sparse Attention patterns for efficient LLM inference. ☆56 · Updated 3 weeks ago
- ☆92 · Updated last year
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al. (NeurIPS 2024) ☆184 · Updated 8 months ago
- ☆78 · Updated 10 months ago
- A fast implementation of T5/UL2 in PyTorch using Flash Attention ☆82 · Updated 3 weeks ago
- Deep learning library implemented from scratch in numpy. Mixtral, Mamba, LLaMA, GPT, ResNet, and other experiments. ☆51 · Updated 10 months ago
- Exploring finetuning public checkpoints on filtered 8K sequences on the Pile ☆115 · Updated last year
- ☆90 · Updated 8 months ago
- Minimal (400 LOC) implementation, Maximum (multi-node, FSDP) GPT training ☆122 · Updated 10 months ago
- Experiments around a simple idea for inducing multiple hierarchical predictive models within a GPT ☆206 · Updated 5 months ago
- some common Huggingface transformers in maximal update parametrization (µP) ☆78 · Updated 2 years ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆51 · Updated 10 months ago
- Implementation of GateLoop Transformer in Pytorch and Jax ☆87 · Updated 8 months ago
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ☆215 · Updated 2 weeks ago
- Some preliminary explorations of Mamba's context scaling. ☆213 · Updated last year
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day ☆255 · Updated last year