swiss-ai / Megatron-LM

Ongoing research training transformer models at scale

☆40

Alternatives and similar repositories for Megatron-LM

Users who are interested in Megatron-LM are comparing it to the libraries listed below.

arcee-ai / DAM
☆55 Updated last year
AnswerDotAI / ModernBERT-Instruct-mini-cookbook
☆53 Updated 10 months ago
louisbrulenaudet / ragoon
High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡
☆68 Updated 3 weeks ago
allenai / olmo-cookbook
OLMost every training recipe you need to perform data interventions with the OLMo family of models.
☆57 Updated this week
nexusflowai / NexusBench
Nexusflow function call, tool use, and agent benchmarks.
☆30 Updated 11 months ago
miralab-ai / autoreason
☆40 Updated 11 months ago
LLM360 / amber-data-prep
Data preparation code for Amber 7B LLM
☆93 Updated last year
LLM360 / crystalcoder-data-prep
Data preparation code for CrystalCoder 7B LLM
☆45 Updated last year
ibm-granite / granite-embedding-models
☆48 Updated last month
pgasawa / BARE
Leveraging Base Language Models for Few-Shot Synthetic Data Generation
☆38 Updated last month
huggingface / huggingface-inference-toolkit
Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models.
☆88 Updated 3 weeks ago
HazyResearch / aioli
Aioli: A unified optimization framework for language model data mixing
☆31 Updated 10 months ago
QuixiAI / kraken
☆68 Updated last year
reka-ai / rekaquant
☆62 Updated 4 months ago
QuixiAI / spectrum
☆138 Updated 3 months ago
Pleias / Pleias-RAG-Library
Python library to use Pleias-RAG models
☆67 Updated 7 months ago
penfever / wildchat-50m
Code, results and other artifacts from the paper introducing the WildChat-50m dataset and the Re-Wild model family.
☆31 Updated 8 months ago
weaviate / structured-rag
Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models
☆115 Updated 8 months ago
choosewhatulike / case2code
☆17 Updated 8 months ago
samchaineau / llm_slerp_generation
Repo hosting codes and materials related to speeding LLMs' inference using token merging.
☆37 Updated 2 months ago
LLM360 / Analysis360
Open Implementations of LLM Analyses
☆107 Updated last year
davanstrien / haiku-dpo
Using open source LLMs to build synthetic datasets for direct preference optimization
☆70 Updated last year
s-smits / grpo-optuna
Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna
☆59 Updated last month
tensorwavecloud / ScalarLM
ScalarLM - a unified training and inference stack
☆94 Updated 3 weeks ago
bespokelabsai / verifiers
Verifiers for LLM Reinforcement Learning
☆80 Updated 7 months ago
Pleias / OCRoscope
Small python package to measure OCR quality and other related metrics.
☆25 Updated last year
EduardTalianu / EntropixLab
entropix style sampling + GUI
☆27 Updated last year
goncalorafaria / qalign
QAlign is a new test-time alignment approach that improves language model performance by using Markov chain Monte Carlo methods.
☆26 Updated 3 weeks ago
open-lm-engine / lm-engine
LM engine is a library for pretraining/finetuning LLMs
☆77 Updated this week
IST-DASLab / gptq-gguf-toolkit
Efficient non-uniform quantization with GPTQ for GGUF
☆53 Updated 2 months ago