nisten / grokadamw
new optimizer
☆19 Updated 7 months ago
Alternatives and similar repositories for grokadamw:
Users who are interested in grokadamw are comparing it to the libraries listed below.
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆59 Updated last year
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆39 Updated last month
- ☆48 Updated 4 months ago
- ☆38 Updated last month
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆24 Updated last year
- Simple GRPO scripts and configurations. ☆59 Updated last month
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆31 Updated 10 months ago
- Lego for GRPO ☆25 Updated last week
- 🚀 Automatically convert unstructured data into a high-quality 'textbook' format, optimized for fine-tuning Large Language Models (LLMs) ☆26 Updated last year
- ☆52 Updated 7 months ago
- Set of scripts to finetune LLMs ☆37 Updated last year
- ☆22 Updated 9 months ago
- Latent Large Language Models ☆17 Updated 7 months ago
- ☆49 Updated last year
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆26 Updated 4 months ago
- ☆126 Updated 7 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated 8 months ago
- ☆48 Updated last year
- ☆32 Updated 9 months ago
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs. ☆42 Updated 10 months ago
- ☆112 Updated 6 months ago
- ☆35 Updated last year
- Experiments for efforts to train a new and improved t5 ☆77 Updated 11 months ago
- Fine-tunes a student LLM using teacher feedback for improved reasoning and answer quality. Implements GRPO with teacher-provided evaluati… ☆39 Updated 3 weeks ago
- Data preparation code for CrystalCoder 7B LLM ☆44 Updated 10 months ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's Pytorch Lightning suite. ☆33 Updated last year
- RWKV-7: Surpassing GPT ☆82 Updated 4 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆55 Updated 7 months ago
- ☆57 Updated 6 months ago
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ☆41 Updated last year