okoge-kaz / moe-recipes

Ongoing research on training Mixture of Experts models.

☆21

Alternatives and similar repositories for moe-recipes

Users who are interested in moe-recipes are comparing it to the libraries listed below.

swallow-llm / swallow-evaluation-instruct
Evaluation framework for post-trained large language models from the Swallow project
☆23Updated last month
swallow-llm / swallow-evaluation
Evaluation scripts for large language models from the Swallow project
☆22Updated 2 months ago
hppRC / llm-translator
Mixtral-based Ja-En (En-Ja) Translation model
☆20Updated 10 months ago
llm-jp / llm-jp-sft
☆61Updated last year
lighttransport / japanese-llama-experiment
Japanese LLaMa experiment
☆54Updated last month
kotoba-tech / kotoba-recipes
Supports continual pre-training and instruction tuning; forked from llama-recipes
☆33Updated last year
llm-jp / llm-jp-corpus
☆43Updated last year
kotoba-tech / kotomamba
Mamba training library developed by kotoba technologies
☆70Updated last year
Ino-Ichan / GIT-LLM
☆22Updated 2 years ago
ce-lery / japanese-mistral-300m-recipe
☆17Updated 2 months ago
SakanaAI / TAID
Official implementation of "TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models"
☆119Updated last month
KanHatakeyama / synthetic-texts-by-llm
☆27Updated last year
leia-llm / leia
LEIA: Facilitating Cross-Lingual Knowledge Transfer in Language Models with Entity-based Data Augmentation
☆22Updated last year
neodyland / entropix
Unofficial entropix implementation for Gemma2, Llama, Qwen2, and Mistral
☆17Updated 10 months ago
yuzu-ai / japanese-llm-ranking
☆50Updated last year
llm-jp / text2dataset
Easily turn large English text datasets into Japanese text datasets using open LLMs.
☆24Updated 10 months ago
okoge-kaz / llm-recipes
Ongoing research project for continual pre-training of LLMs (dense mode)
☆43Updated 8 months ago
HojiChar / HojiChar
The robust text processing pipeline framework enabling customizable, efficient, and metric-logged text preprocessing.
☆123Updated 2 weeks ago
ku-nlp / ja-vicuna-qa-benchmark
☆33Updated last year
wandb / llm-leaderboard
Project for evaluating LLMs on Japanese tasks
☆90Updated last month
turingmotors / vlm-recipes
☆20Updated last year
Aratako / Task-Vector-Merge-Optimzier
☆16Updated last year
llm-jp / llm-jp-eval
☆140Updated 2 weeks ago
Aratako / Japanese-RP-Bench
☆14Updated last year
nlp-waseda / JMMLU
Japanese Massive Multitask Language Understanding Benchmark
☆38Updated last month
frodo821 / BitNet-Transformers
0️⃣1️⃣🤗 BitNet-Transformers: Huggingface Transformers Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" i…
☆98Updated last year
sbintuitions / flexeval
Flexible evaluation tool for language models
☆54Updated this week
kunishou / do-not-answer-ja
☆24Updated last year
pfnet-research / pfgen-bench
Preferred Generation Benchmark
☆85Updated last month
matsuolab / ucllm_nedo_prod
☆55Updated last year