NousResearch / StripedHyenaTrainer
☆60 Updated last year
Alternatives and similar repositories for StripedHyenaTrainer:
Users who are interested in StripedHyenaTrainer are comparing it to the libraries listed below.
- ☆22 Updated last year
- ☆49 Updated 10 months ago
- Comprehensive analysis of difference in performance of QLora, Lora, and Full Finetunes. ☆82 Updated last year
- ☆48 Updated last year
- An implementation of Self-Extend, to expand the context window via grouped attention ☆118 Updated last year
- Using multiple LLMs for ensemble Forecasting ☆16 Updated last year
- [WIP] Transformer to embed Danbooru labelsets ☆13 Updated 9 months ago
- Public Inflection Benchmarks ☆69 Updated 10 months ago
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr… ☆53 Updated 3 months ago
- Full finetuning of large language models without large memory requirements ☆93 Updated last year
- ☆48 Updated 2 months ago
- Score LLM pretraining data with classifiers ☆54 Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆91 Updated 2 months ago
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models ☆69 Updated last year
- Experiments for efforts to train a new and improved t5 ☆77 Updated 9 months ago
- A MAD laboratory to improve AI architecture designs 🧪 ☆102 Updated last month
- ☆37 Updated 6 months ago
- Simplex Random Feature attention, in PyTorch ☆72 Updated last year
- Implementation of the Llama architecture with RLHF + Q-learning ☆157 Updated last year
- A repository of projects and datasets under active development by Alignment Lab AI ☆22 Updated last year
- Code repository for the c-BTM paper ☆105 Updated last year
- Demonstration that finetuning a RoPE model on longer sequences than the pre-trained model saw adapts the model's context limit ☆63 Updated last year
- ☆41 Updated last year
- ☆27 Updated 6 months ago
- ☆92 Updated last year
- QLoRA with Enhanced Multi GPU Support ☆36 Updated last year
- Latent Diffusion Language Models ☆68 Updated last year
- A pipeline for using API calls to agnostically convert unstructured data into structured training data ☆29 Updated 4 months ago
- ☆78 Updated 9 months ago