eniompw / nanoGPTshakespeare
Fine-tuning Shakespeare on karpathy/nanoGPT
☆18 Updated 2 years ago
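For context, the repo's workflow follows upstream karpathy/nanoGPT's Shakespeare recipe. Below is a minimal sketch of the data-prep step, mirroring nanoGPT's `data/shakespeare/prepare.py`; the `input.txt` path and the GPT-2 BPE choice are carried over from upstream nanoGPT and are assumptions about this fork, not confirmed details of it.

```python
# A sketch of nanoGPT-style data prep for Shakespeare fine-tuning.
# Mirrors upstream karpathy/nanoGPT's data/shakespeare/prepare.py;
# file names here are assumptions, not taken from eniompw's fork.
import numpy as np
import tiktoken  # GPT-2 BPE tokenizer, as used by nanoGPT

with open("input.txt", "r", encoding="utf-8") as f:  # the tiny Shakespeare corpus
    data = f.read()

# 90/10 train/validation split, as in upstream prepare.py
n = len(data)
train_data, val_data = data[: int(n * 0.9)], data[int(n * 0.9):]

enc = tiktoken.get_encoding("gpt2")
train_ids = enc.encode_ordinary(train_data)
val_ids = enc.encode_ordinary(val_data)

# nanoGPT's train.py memory-maps these uint16 token binaries
np.array(train_ids, dtype=np.uint16).tofile("train.bin")
np.array(val_ids, dtype=np.uint16).tofile("val.bin")
```

Fine-tuning then reuses upstream nanoGPT's `train.py` with a config along the lines of `config/finetune_shakespeare.py`, which initializes from pretrained GPT-2 weights rather than training from scratch.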
Alternatives and similar repositories for nanoGPTshakespeare:
Users interested in nanoGPTshakespeare are comparing it to the repositories listed below.
- Fine-tune and quantize Llama-2-like models to generate Python code using QLoRA, Axolotl, etc. ☆64 Updated last year
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite. ☆33 Updated last year
- Repository containing awesome resources regarding Hugging Face tooling. ☆46 Updated last year
- Collection of autoregressive model implementations ☆83 Updated last month
- This repository contains a better implementation of Kolmogorov-Arnold networks ☆61 Updated 10 months ago
- ☆60 Updated last year
- Using multiple LLMs for ensemble forecasting ☆16 Updated last year
- Testing KAN-based text generation GPT models ☆16 Updated 10 months ago
- Inference code for mixtral-8x7b-32kseqlen ☆99 Updated last year
- A library for simplifying fine-tuning with multi-GPU setups in the Hugging Face ecosystem. ☆16 Updated 5 months ago
- FMS Model Optimizer is a framework for developing reduced-precision neural network models. ☆16 Updated this week
- Exploration into the Firefly algorithm in PyTorch ☆35 Updated last month
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models ☆69 Updated last year
- ☆27 Updated 8 months ago
- Eh, simple and works. ☆27 Updated last year
- An alternative way of calculating self-attention ☆18 Updated 10 months ago
- Training and fine-tuning an LLM in Python and PyTorch. ☆41 Updated last year
- Building a large language foundation model ☆9 Updated 3 weeks ago
- QLoRA with Enhanced Multi-GPU Support ☆36 Updated last year
- A JAX-like function transformation engine, but micro: microjax ☆30 Updated 5 months ago
- ☆26 Updated 3 weeks ago
- Rust bindings for CTranslate2 ☆14 Updated last year
- 👷 Build compute kernels ☆24 Updated this week
- Fine-tuning BLOOM on a single GPU using gradient accumulation ☆28 Updated 2 years ago
- ☆31 Updated 9 months ago
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆31 Updated 10 months ago
- Minimal LLM scripts for 24 GB VRAM GPUs: training, inference, whatever ☆38 Updated last week
- Explorations into the proposal from the paper "Grokfast: Accelerated Grokking by Amplifying Slow Gradients" ☆98 Updated 3 months ago
- Mixtral finetuning ☆19 Updated last year
- ☆23 Updated this week