Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models
☆70Aug 27, 2023Updated 2 years ago
Alternatives and similar repositories for zeus-llm-trainer
Users who are interested in zeus-llm-trainer are comparing it to the libraries listed below
- Supervised instruction finetuning for LLM with HF trainer and Deepspeed☆36Jul 6, 2023Updated 2 years ago
- Modified Stanford-Alpaca Trainer for Training Replit's Code Model☆42Jun 1, 2023Updated 2 years ago
- Token-level adaptation of LoRA matrices for downstream task generalization.☆15Apr 14, 2024Updated last year
- A framework for few-shot evaluation of autoregressive language models.☆16Aug 23, 2023Updated 2 years ago
- ☆22Aug 27, 2023Updated 2 years ago
- LLM Building Blocks for Python Course☆16Nov 17, 2025Updated 3 months ago
- QLoRA: Efficient Finetuning of Quantized LLMs☆11Jun 1, 2023Updated 2 years ago
- Demonstration that finetuning a RoPE model on longer sequences than those used in pre-training adapts the model's context limit☆63Jun 21, 2023Updated 2 years ago
- A repository of prompts and Python scripts for intelligent transformation of raw text into diverse formats.☆31May 29, 2023Updated 2 years ago
- Public reports detailing responses to sets of prompts by Large Language Models.☆32Jan 4, 2025Updated last year
- ReBase: Training Task Experts through Retrieval Based Distillation☆29Feb 5, 2025Updated last year
- [WIP] Transformer to embed Danbooru labelsets☆13Mar 31, 2024Updated last year
- An unofficial implementation of the SOLAR-10.7B model and the newly proposed interlocked-DUS (iDUS), with experiment details.☆14Mar 20, 2024Updated last year
- Set of scripts to finetune LLMs☆38Mar 30, 2024Updated last year
- Multipack distributed sampler for fast padding-free training of LLMs☆206Aug 10, 2024Updated last year
- An introduction to DSPy☆34Aug 30, 2025Updated 6 months ago
- This project is a Drake Hotline Bling meme generator using GPT-4 and Streamlit. The generator takes a user's input and generates a meme w…☆16May 2, 2023Updated 2 years ago
- Qwen3-0.6B megakernel: 527 tok/s decode on RTX 3090 (3.8x faster than PyTorch)☆81Feb 10, 2026Updated last month
- A Data Source for Reasoning Embodied Agents☆19Sep 18, 2023Updated 2 years ago
- A Flask version of the model visualization tool netron☆19Jul 20, 2022Updated 3 years ago
- Layout Analysis Dataset with Segmonto (LADaS)☆24Jul 12, 2025Updated 7 months ago
- Tokenization across languages. Useful as preprocessing for subword tokenization.☆21Feb 7, 2023Updated 3 years ago
- [ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization☆713Aug 13, 2024Updated last year
- CNN ensemble for prostate cancer Gleason grading☆19Jan 28, 2026Updated last month
- Developer showcase of projects built on Cartesia☆20Aug 28, 2024Updated last year
- Make triton easier☆50Jun 12, 2024Updated last year
- Approximating the joint distribution of language models via MCTS☆22Nov 3, 2024Updated last year
- ☆19Apr 4, 2023Updated 2 years ago
- Simple Model Similarities Analysis☆21Feb 3, 2024Updated 2 years ago
- Instruct-tune Open LLaMA / RedPajama / StableLM models on consumer hardware using QLoRA☆81Dec 15, 2023Updated 2 years ago
- ☆415Nov 2, 2023Updated 2 years ago
- Utilities for efficient fine-tuning, inference and evaluation of code generation models☆21Oct 3, 2023Updated 2 years ago
- QLoRA for Masked Language Modeling☆23Sep 11, 2023Updated 2 years ago
- Image Diffusion block merging technique applied to transformers based Language Models.☆56May 8, 2023Updated 2 years ago
- Unofficial implementation of AlpaGasus☆95Sep 23, 2023Updated 2 years ago
- Code for the paper: CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models☆31Apr 1, 2025Updated 11 months ago
- A minimal code sample for implementing MLOps (preprocessing, training, evaluation, and inference, plus management of experiments, models, and workflows) on Amazon SageMaker☆24Apr 26, 2022Updated 3 years ago
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts☆23Mar 12, 2024Updated last year
- Projects developed by Domino's R&D team☆77Apr 14, 2022Updated 3 years ago