IBM / dolomite-engine

Dolomite Engine is a library for pretraining/finetuning LLMs

☆52

Alternatives and similar repositories for dolomite-engine:

Users that are interested in dolomite-engine are comparing it to the libraries listed below

google-deepmind / asyncdiloco
☆43Updated last year
siyan-zhao / prepacking
The source code of our work "Prepacking: A Simple Method for Fast Prefilling and Increased Throughput in Large Language Models" [AISTATS …
☆59Updated 6 months ago
IBM / ensemble-instruct
codebase release for EMNLP2023 paper publication
☆19Updated last year
TRI-ML / linear_open_lm
A repository for research on medium sized language models.
☆76Updated 11 months ago
EleutherAI / improved-t5
Experiments for efforts to train a new and improved t5
☆77Updated last year
ContextualAI / CLAIR_and_APO
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
☆55Updated 8 months ago
IBM / text-generation-inference
IBM development fork of https://github.com/huggingface/text-generation-inference
☆60Updated 4 months ago
hamishivi / EasyLM
Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl…
☆72Updated 8 months ago
mayank31398 / ladder-residual-inference
☆13Updated last month
ServiceNow / PipelineRL
A scalable asynchronous reinforcement learning implementation with in-flight weight updates.
☆100Updated this week
foundation-model-stack / bamba
Train, tune, and infer Bamba model
☆115Updated last week
arcee-ai / DAM
☆48Updated 6 months ago
Edward-Sun / gpt-accelera
Simple and efficient pytorch-native transformer training and inference (batched)
☆75Updated last year
kyegomez / Infini-attention
Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO…
☆55Updated 2 weeks ago
allenai / bff
☆38Updated last year
srush / LLM-Talk
☆49Updated last year
microsoft / mutransformers
some common Huggingface transformers in maximal update parametrization (µP)
☆80Updated 3 years ago
liujch1998 / infini-gram
☆37Updated this week
IST-DASLab / SparseFinetuning
Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry
☆40Updated last year
HazyResearch / train-tk
train with kittens!
☆57Updated 6 months ago
Aleph-Alpha / trigrams
☆54Updated 8 months ago
PiotrNawrot / nano-sparse-attention
The simplest implementation of recent Sparse Attention patterns for efficient LLM inference.
☆60Updated 3 months ago
shreyansh26 / Attention-Mask-Patterns
Using FlexAttention to compute attention with different masking patterns
☆43Updated 7 months ago
jeffreysijuntan / lloco
The official repo for "LLoCo: Learning Long Contexts Offline"
☆116Updated 10 months ago
epfml / llm-baselines
nanoGPT-like codebase for LLM training
☆94Updated last month
JHU-CLSP / RATIONALYST
Code for RATIONALYST: Pre-training Process-Supervision for Improving Reasoning https://arxiv.org/pdf/2410.01044
☆32Updated 7 months ago
ZeroSumEval / ZeroSumEval
A framework for pitting LLMs against each other in an evolving library of games ⚔
☆32Updated 3 weeks ago
kernelmachine / cbtm
Code repository for the c-BTM paper
☆106Updated last year
LLM360 / crystalcoder-train
Pre-training code for CrystalCoder 7B LLM
☆54Updated 11 months ago
snowflakedb / ArcticInference
☆45Updated last week