euclaise / supertrainer2000
☆49 · Updated last year
Alternatives and similar repositories for supertrainer2000
Users interested in supertrainer2000 are comparing it to the libraries listed below.
- Collection of autoregressive model implementations ☆85 · Updated last month
- Demonstration that finetuning a RoPE model on sequences longer than its pre-training length extends the model's context limit ☆63 · Updated 2 years ago
- ☆53 · Updated last year
- ☆78 · Updated 11 months ago
- [WIP] Transformer to embed Danbooru labelsets ☆13 · Updated last year
- QLoRA with enhanced multi-GPU support ☆37 · Updated last year
- ☆22 · Updated last year
- ☆34 · Updated last year
- ☆81 · Updated last year
- Full finetuning of large language models without large memory requirements ☆94 · Updated last year
- GoldFinch and other hybrid transformer components ☆45 · Updated 11 months ago
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆24 · Updated last year
- Fast, modern, memory-efficient, and low-precision PyTorch optimizers ☆94 · Updated this week
- ☆20 · Updated last year
- Exploring finetuning of public checkpoints on filtered 8K sequences from the Pile ☆115 · Updated 2 years ago
- An open-source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆101 · Updated 3 months ago
- ☆63 · Updated 8 months ago
- NanoGPT-speedrunning for the poor T4 enjoyers ☆66 · Updated 2 months ago
- Support for PyTorch FSDP with optimizers ☆82 · Updated 6 months ago
- Experiments toward training a new and improved T5 ☆77 · Updated last year
- Latent Large Language Models ☆18 · Updated 9 months ago
- Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers ☆17 · Updated 3 months ago
- Comprehensive analysis of the performance differences between QLoRA, LoRA, and full finetunes ☆82 · Updated last year
- https://x.com/BlinkDL_AI/status/1884768989743882276 ☆28 · Updated last month
- An introduction to LLM sampling ☆78 · Updated 6 months ago
- ☆19 · Updated last month
- Implementation of the Mamba SSM with hf_integration ☆56 · Updated 9 months ago
- A repository for research on medium-sized language models ☆76 · Updated last year
- This repo is based on https://github.com/jiaweizzhao/GaLore ☆28 · Updated 9 months ago
- Lightweight toolkit to train and finetune 1.58-bit language models ☆78 · Updated last month