LLM360/amber-train

LLM360 / amber-train

Pre-training code for Amber 7B LLM

☆175

Alternatives and similar repositories for amber-train

Users interested in amber-train are comparing it to the repositories listed below.

LLM360 / crystalcoder-train
Pre-training code for CrystalCoder 7B LLM
☆59 · May 10, 2024 · Updated 2 years ago

LLM360 / amber-data-prep
Data preparation code for Amber 7B LLM
☆96 · May 10, 2024 · Updated 2 years ago

LLM360 / crystalcoder-data-prep
Data preparation code for CrystalCoder 7B LLM
☆45 · May 10, 2024 · Updated 2 years ago

LLM360 / Analysis360
Open Implementations of LLM Analyses
☆110 · Oct 8, 2024 · Updated last year

premAI-io / serverless-examples
🚀 End-to-end examples and analysis of deploying LLMs serverless using Modal, Runpod, and Beam
☆28 · Mar 25, 2024 · Updated 2 years ago

mbzuai-oryx / MobiLlama
[ICLR-2025-SLLM Spotlight 🔥] MobiLlama: Small Language Model tailored for edge devices
☆667 · May 10, 2025 · Updated last year

LLM360 / k2-train
☆58 · Jun 6, 2024 · Updated 2 years ago

SprocketLab / Alchemist
☆12 · Mar 4, 2025 · Updated last year

premAI-io / cookbook
Explore different generative AI use cases and examples with Prem AI cookbook
☆25 · Jul 2, 2024 · Updated 2 years ago

sail-sg / SimLayerKV
The official implementation of paper: SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction.
☆54 · Oct 18, 2024 · Updated last year

LegallyCoder / mamba-hf
Implementation of the Mamba SSM with hf_integration.
☆55 · Aug 31, 2024 · Updated last year

kimyuji / EvolvingQA_benchmark
Code and Dataset release of "Carpe Diem: On the Evaluation of World Knowledge in Lifelong Language Models" (NAACL 2024)
☆10 · Oct 16, 2024 · Updated last year

SalesforceAIResearch / MobileAIBench
☆26 · Jun 2, 2026 · Updated last month

allenai / OLMo
Modeling, training, eval, and inference code for OLMo
☆6,600 · Nov 24, 2025 · Updated 7 months ago

joeljang / FLM
All-in-one repository for Fine-tuning & Pretraining (Large) Language Models
☆15 · Mar 8, 2023 · Updated 3 years ago

huggingface / llm-swarm
Manage scalable open LLM inference endpoints in Slurm clusters
☆289 · Jul 11, 2024 · Updated 2 years ago

IlyasMoutawwakil / py-txi
A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.
☆32 · Sep 19, 2025 · Updated 10 months ago

RAIVNLab / MatFormer-OLMo
Code repository for the public reproduction of the language modelling experiments on "MatFormer: Nested Transformer for Elastic Inference…
☆31 · Nov 14, 2023 · Updated 2 years ago

pkunlp-icler / MLS
Source code of our paper "Focus on the Target’s Vocabulary: Masked Label Smoothing for Machine Translation" @ ACL 2022
☆13 · Apr 13, 2022 · Updated 4 years ago

ServiceNow / promptmix-emnlp-2023
Official code repository for PromptMix: A Class Boundary Augmentation Method for Large Language Model Distillation, EMNLP 2023
☆12 · Dec 13, 2023 · Updated 2 years ago

kaistAI / Janus
[NeurIPS 2024] Train LLMs with diverse system messages reflecting individualized preferences to generalize to unseen system messages
☆53 · Aug 10, 2025 · Updated 11 months ago

jzhang38 / TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
☆9,017 · May 3, 2024 · Updated 2 years ago

premAI-io / Ayup
Quickly and securely turn any Linux box into a build and deployment assistant
☆25 · Oct 3, 2024 · Updated last year

YuchuanTian / RethinkTinyLM
[ICML'24] The official implementation of “Rethinking Optimization and Architecture for Tiny Language Models”
☆126 · Jan 14, 2025 · Updated last year

EleutherAI / pythia
The hub for EleutherAI's work on interpretability and learning dynamics
☆2,861 · Nov 15, 2025 · Updated 8 months ago

allenai / dolma
Data and tools for generating and inspecting OLMo pre-training data.
☆1,526 · Nov 5, 2025 · Updated 8 months ago

EleutherAI / semantic-memorization
☆44 · Nov 17, 2024 · Updated last year

kaistAI / GAP
[ACL 2023] Gradient Ascent Post-training Enhances Language Model Generalization
☆29 · Sep 12, 2024 · Updated last year

huggingface / nanotron
Minimalistic large language model 3D-parallelism training
☆2,761 · May 26, 2026 · Updated last month

myshell-ai / JetMoE
Reaching LLaMA2 Performance with 0.1M Dollars
☆985 · Jul 23, 2024 · Updated last year

kaistAI / InstructIR
InstructIR, a novel benchmark specifically designed to evaluate the instruction following ability in information retrieval models. Our foc…
☆32 · Jun 13, 2024 · Updated 2 years ago

RL10x / RetNet
An implementation of the paper "Retentive Network: A Successor to Transformer for Large Language Models" https://arxiv.org/pdf/2307.08621.pdf
☆11 · Jul 25, 2023 · Updated 2 years ago

kttian / llm_factuality_tuning
☆40 · May 2, 2024 · Updated 2 years ago

SprocketLab / roboshot
☆24 · May 30, 2024 · Updated 2 years ago

SqueezeAILab / LLM2LLM
[ACL 2024] LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement
☆196 · Mar 25, 2024 · Updated 2 years ago

EleutherAI / pilev2
☆13 · Jan 20, 2023 · Updated 3 years ago

chenllliang / ParetoMNMT
Source code for paper "On the Pareto Front of Multilingual Neural Machine Translation" @ NeurIPS 2023
☆17 · Sep 27, 2023 · Updated 2 years ago

yale-nlp / refdpo
☆16 · Jul 23, 2024 · Updated 2 years ago

TencentARC / LLaMA-Pro
[ACL 2024] Progressive LLaMA with Block Expansion.
☆513 · May 20, 2024 · Updated 2 years ago