NousResearch / StripedHyenaTrainer
☆61Updated last year
Alternatives and similar repositories for StripedHyenaTrainer
Users who are interested in StripedHyenaTrainer are comparing it to the libraries listed below
- ☆22Updated last year
- Comprehensive analysis of difference in performance of QLora, Lora, and Full Finetunes.☆82Updated last year
- inference code for mixtral-8x7b-32kseqlen☆100Updated last year
- ☆48Updated last year
- Simplex Random Feature attention, in PyTorch☆74Updated last year
- ☆49Updated last year
- ☆27Updated 10 months ago
- Full finetuning of large language models without large memory requirements☆94Updated last year
- ☆81Updated last year
- gzip Predicts Data-dependent Scaling Laws☆35Updated 11 months ago
- Public Inflection Benchmarks☆68Updated last year
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna☆53Updated 3 months ago
- Collection of autoregressive model implementations☆85Updated 3 weeks ago
- Train your own SOTA deductive reasoning model☆92Updated 2 months ago
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)☆98Updated 2 months ago
- an implementation of Self-Extend, to expand the context window via grouped attention☆119Updated last year
- [WIP] Transformer to embed Danbooru labelsets☆13Updated last year
- some common Huggingface transformers in maximal update parametrization (µP)☆80Updated 3 years ago
- ☆48Updated 6 months ago
- An introduction to LLM Sampling☆78Updated 5 months ago
- QLoRA with Enhanced Multi GPU Support☆37Updated last year
- Code repository for the c-BTM paper☆106Updated last year
- Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr…☆60Updated 6 months ago
- Implementation of the Llama architecture with RLHF + Q-learning☆164Updated 3 months ago
- Synthetic data derived by templating, few shot prompting, transformations on public domain corpora, and monte carlo tree search.☆32Updated 2 months ago
- ☆21Updated 6 months ago
- Small and Efficient Mathematical Reasoning LLMs☆71Updated last year
- ☆33Updated 10 months ago
- Experiments for efforts to train a new and improved t5☆77Updated last year
- Ongoing research training transformer models at scale☆37Updated last year