swiss-ai / Megatron-LM
Ongoing research training transformer models at scale
☆42 Updated this week
Alternatives and similar repositories for Megatron-LM
Users who are interested in Megatron-LM are comparing it to the libraries listed below.
- ☆55 Updated last year
- ☆53 Updated 11 months ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆69 Updated 2 months ago
- Nexusflow function call, tool use, and agent benchmarks. ☆30 Updated last year
- ☆17 Updated 9 months ago
- Python library to use Pleias-RAG models ☆67 Updated 8 months ago
- ☆59 Updated last year
- ☆39 Updated last year
- Open Implementations of LLM Analyses ☆107 Updated last year
- LM engine is a library for pretraining/finetuning LLMs ☆110 Updated last week
- ☆62 Updated 6 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆61 Updated last year
- Source code for the collaborative reasoner research project at Meta FAIR. ☆112 Updated 9 months ago
- Data preparation code for CrystalCoder 7B LLM ☆45 Updated last year
- A toolkit implementing advanced methods to transfer models and model knowledge across tokenizers. ☆61 Updated 6 months ago
- FMS Model Optimizer is a framework for developing reduced precision neural network models. ☆20 Updated 2 weeks ago
- Lightweight toolkit package to train and fine-tune 1.58bit Language models ☆106 Updated 8 months ago
- ☆90 Updated last month
- ☆51 Updated 3 months ago
- Aioli: A unified optimization framework for language model data mixing ☆32 Updated last year
- ☆32 Updated last year
- Train LLM on Hugging Face infra ☆67 Updated 2 months ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆72 Updated last year
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆59 Updated 3 months ago
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte… ☆81 Updated last year
- ☆138 Updated 4 months ago
- 💻 SETA: Scaling Environments for Terminal Agents - Environments ☆38 Updated last week
- ☆48 Updated last year
- EvaByte: Efficient Byte-level Language Models at Scale ☆114 Updated 8 months ago
- Simple examples using Argilla tools to build AI ☆57 Updated last year