SakanaAI / CycleQD
CycleQD is a framework for parameter space model merging.
☆27Updated last month
Alternatives and similar repositories for CycleQD:
Users who are interested in CycleQD are comparing it to the libraries listed below.
- Code for Discovering Preference Optimization Algorithms with and for Large Language Models☆55Updated 7 months ago
- ☆22Updated last year
- ☆15Updated 4 months ago
- ☆26Updated 7 months ago
- Checkpointable dataset utilities for foundation model training☆32Updated 11 months ago
- Ongoing research project for continual pre-training of LLMs (dense mode)☆37Updated 3 weeks ago
- SDTT: a simple and effective distillation method for discrete diffusion models☆16Updated this week
- Mamba training library developed by kotoba technologies☆67Updated 11 months ago
- ☆14Updated 9 months ago
- Japanese LLaMa experiment☆52Updated last month
- Example of using Epochraft to train HuggingFace transformers models with PyTorch FSDP☆12Updated 11 months ago
- ☆55Updated 7 months ago
- ☆15Updated last month
- Jax/Flax implementation of Denoising Diffusion Implicit Models☆17Updated 2 years ago
- Ongoing research on training Mixture of Experts models.☆19Updated 4 months ago
- ☆31Updated 9 months ago
- Creates a list of the latest LLMs☆16Updated this week
- ☆33Updated 5 months ago
- minimal diffusion model for self-study☆19Updated last year
- Code for the paper "A mathematical perspective on Transformers".☆34Updated 6 months ago
- Swallow project: evaluation scripts for large language models☆14Updated 6 months ago
- ☆12Updated last year
- Code for the "Cultural evolution in populations of Large Language Models" paper☆29Updated 2 months ago
- Supports continual pre-training and instruction tuning; forked from llama-recipes☆31Updated 11 months ago
- ☆14Updated 4 months ago
- Docker for everyday deep learning research on a remote server. (TensorFlow & PyTorch / Jax + VNC)☆22Updated last month
- ☆12Updated 3 years ago
- LLaVA-JP is a Japanese VLM trained with the LLaVA method☆59Updated 6 months ago
- ☆31Updated last month
- PyTorch Implementation of the paper "Towards Learning Abductive Reasoning using VSA Distributed Representations".☆14Updated last month