stanford-futuredata / Megatron-LM

Ongoing research training transformer models at scale

☆38

Alternatives and similar repositories for Megatron-LM

Users who are interested in Megatron-LM are comparing it to the libraries listed below.

teknium1 / transformers-gptq-quant
☆45Updated 2 years ago
teknium1 / LLM-Benchmark-Logs
Just a bunch of benchmark logs for different LLMs
☆119Updated last year
vikhyat / mixtral-inference
inference code for mixtral-8x7b-32kseqlen
☆102Updated last year
ai8hyf / OpenResearchAssistant
An automated tool for discovering insights from research paper corpora
☆138Updated last year
interstellarninja / MeeseeksAI
A framework for orchestrating AI agents using a mermaid graph
☆77Updated last year
sdan / selfextend
an implementation of Self-Extend, to expand the context window via grouped attention
☆119Updated last year
teknium1 / ShareGPT-Builder
☆116Updated 11 months ago
knowrohit / know_medical_dialogues
KMD is a collection of conversational exchanges between patients and doctors on various medical topics. It aims to capture the intricaci…
☆24Updated 2 years ago
xjdr-alt / llmri
look how they massacred my boy
☆63Updated last year
AK391 / dailypapersHN
☆86Updated last year
s-smits / grpo-optuna
Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna
☆58Updated last month
bipul1010 / agents_tutorial
☆19Updated last year
JeezAI / DSPy_matchmaking
A seamless matchmaking application that is programmed with Cohere Command R+, Stanford NLP DSPy framework, Weaviate Vector store and Crew…
☆59Updated last year
tensoic / Cerule
Cerule - A Tiny Mighty Vision Model
☆67Updated last week
yoheinakajima / autofinetune
auto fine tune of models with synthetic data
☆76Updated last year
BBischof / yapping
Verbosity control for AI agents
☆64Updated last year
geronimi73 / qlora-minimal
☆88Updated 2 years ago
QuixiAI / kraken
☆67Updated last year
BerriAI / instructprompt
☆107Updated 2 years ago
abacaj / openhermes-function-calling
☆135Updated last year
Alignment-Lab-AI / KnowledgeBase
never forget anything again! combine AI and intelligent tooling for a local knowledge base to track catalogue, annotate, and plan for you…
☆38Updated last year
swyxio / openlangmem
☆47Updated last year
NousResearch / StripedHyenaTrainer
☆62Updated last year
andrew-silva / mlx-rlhf
An example implementation of RLHF (or, more accurately, RLAIF) built on MLX and HuggingFace.
☆33Updated last year
Alignment-Lab-AI / datagen
a pipeline for using api calls to agnostically convert unstructured data into structured training data
☆31Updated last year
Xalp / ECHO
Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025)
☆91Updated 9 months ago
QuixiAI / generate
☆28Updated last year
JoeLi12345 / nGPT
an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)
☆107Updated 8 months ago
SohamGovande / podplex
🦾💻🌐 distributed training & serverless inference at scale on RunPod
☆18Updated last year
enjalot / latent-data-modal
Using modal.com to process FineWeb-edu data
☆20Updated 7 months ago