NousResearch / StripedHyenaTrainer

☆61

Alternatives and similar repositories for StripedHyenaTrainer

Users who are interested in StripedHyenaTrainer are comparing it to the libraries listed below.

epfml / DenseFormer
☆81Updated last year
lucidrains / llama-qrlhf
Implementation of the Llama architecture with RLHF + Q-learning
☆167Updated 8 months ago
KaiNylund / lm-weights-encode-time
☆69Updated last year
euclaise / supertrainer2000
☆50Updated last year
joey00072 / ohara
Collection of autoregressive model implementations
☆86Updated 6 months ago
EleutherAI / improved-t5
Experiments for efforts to train a new and improved t5
☆75Updated last year
CERC-AAI / Robin
☆63Updated last year
notarussianteenager / srf-attention
Simplex Random Feature attention, in PyTorch
☆73Updated 2 years ago
minosvasilias / simple_grpo
Simple GRPO scripts and configurations.
☆59Updated 8 months ago
automix-llm / automix
Mixing Language Models with Self-Verification and Meta-Verification
☆109Updated 10 months ago
Zyphra / Zyda_processing
☆39Updated last year
teknium1 / transformers-gptq-quant
☆46Updated 2 years ago
teknium1 / LLM-Benchmark-Logs
Just a bunch of benchmark logs for different LLMs
☆118Updated last year
huggingface / peft-pytorch-conference
Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given…
☆14Updated 2 years ago
LLM360 / amber-data-prep
Data preparation code for Amber 7B LLM
☆93Updated last year
joshuacnf / Ctrl-G
☆102Updated 9 months ago
arcee-ai / DAM
☆55Updated 11 months ago
CarperAI / treasure_trove
☆22Updated 2 years ago
InflectionAI / Inflection-Benchmarks
Public Inflection Benchmarks
☆68Updated last year
Aleph-Alpha-Research / scaling
Scaling is a distributed training library and installable dependency designed to scale up neural networks, with a dedicated module for tr…
☆64Updated 3 weeks ago
Alignment-Lab-AI / datagen
a pipeline for using api calls to agnostically convert unstructured data into structured training data
☆31Updated last year
geronimi73 / phi2-finetune
☆88Updated last year
xjdr-alt / muzero_sketch
☆40Updated last year
official-elinas / zeus-llm-trainer
Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models
☆69Updated 2 years ago
idiap / sigma-gpt
σ-GPT: A New Approach to Autoregressive Models
☆68Updated last year
euclaise / SlimTrainer
Full finetuning of large language models without large memory requirements
☆93Updated last month
sdan / selfextend
an implementation of Self-Extend, to expand the context window via grouped attention
☆118Updated last year
microsoft / mutransformers
some common Huggingface transformers in maximal update parametrization (µP)
☆86Updated 3 years ago
vikhyat / mixtral-inference
inference code for mixtral-8x7b-32kseqlen
☆102Updated last year
SebastianBodza / EnsembleForecasting
Using multiple LLMs for ensemble Forecasting
☆16Updated last year