babylm / baseline-pretrainingLinks

Code for pre-training BabyLM baseline models.

☆15

Alternatives and similar repositories for baseline-pretraining

Users that are interested in baseline-pretraining are comparing it to the libraries listed below

Sorting:

kotoba-tech / kotomamba
Mamba training library developed by kotoba technologies
☆71Updated last year
okoge-kaz / moe-recipes
Ongoing research training Mixture of Expert models.
☆19Updated 10 months ago
kotoba-tech / kotoba-recipes
Support Continual pre-training & Instruction Tuning forked from llama-recipes
☆32Updated last year
Aratako / Task-Vector-Merge-Optimzier
☆14Updated last year
llm-jp / llm-jp-sft
☆61Updated last year
nlp-waseda / JMMLU
日本語マルチタスク言語理解ベンチマーク Japanese Massive Multitask Language Understanding Benchmark
☆36Updated 7 months ago
yuzu-ai / japanese-llm-ranking
☆49Updated last year
SakanaAI / CycleQD
CycleQD is a framework for parameter space model merging.
☆40Updated 5 months ago
hitachi-nlp / FLD-corpus
☆17Updated 7 months ago
iwiwi / epochraft
Checkpointable dataset utilities for foundation model training
☆32Updated last year
okoge-kaz / llm-recipes
Ongoing Research Project for continaual pre-training LLM(dense mode)
☆42Updated 4 months ago
lighttransport / japanese-llama-experiment
Japanese LLaMa experiment
☆53Updated 7 months ago
swallow-llm / swallow-evaluation
Swallowプロジェクト大規模言語モデル評価スクリプト
☆19Updated 3 months ago
ku-nlp / ja-vicuna-qa-benchmark
☆33Updated 11 months ago
luchris429 / DiscoPOP
Code for Discovering Preference Optimization Algorithms with and for Large Language Models
☆63Updated last year
iwiwi / epochraft-hf-fsdp
Example of using Epochraft to train HuggingFace transformers models with PyTorch FSDP
☆11Updated last year
hitachi-nlp / FLD
☆53Updated 7 months ago
ryokamoi / llm-self-correction-papers
List of papers on Self-Correction of LLMs.
☆73Updated 6 months ago
SakanaAI / TAID
Official implementation of "TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models"
☆110Updated 5 months ago
leia-llm / leia
LEIA: Facilitating Cross-Lingual Knowledge Transfer in Language Models with Entity-based Data Augmentation
☆21Updated last year
hppRC / llm-translator
Mixtral-based Ja-En (En-Ja) Translation model
☆19Updated 6 months ago
turingmotors / vlm-recipes
☆16Updated 10 months ago
sbintuitions / flexeval
Flexible evaluation tool for language models
☆49Updated last week
misonuma / untrac
Code for "Unlearning Traces the Influential Training Data of Language Models"
☆12Updated last year
HojiChar / HojiChar
The robust text processing pipeline framework enabling customizable, efficient, and metric-logged text preprocessing.
☆122Updated this week
hadasah / btm
☆74Updated last year
jdeschena / sdtt
[ICLR 2025] SDTT: a simple and effective distillation method for discrete diffusion models
☆29Updated 3 months ago
allenai / aries
Aligned, Review-Informed Edits of Scientific Papers
☆53Updated 2 years ago
Ino-Ichan / GIT-LLM
☆22Updated last year
wandb / llm-leaderboard
Project of llm evaluation to Japanese tasks
☆84Updated last week