swiss-ai / Megatron-LM
Ongoing research training transformer models at scale
☆30 Updated this week
Alternatives and similar repositories for Megatron-LM
Users who are interested in Megatron-LM are comparing it to the libraries listed below.
- ☆54 Updated 10 months ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆67 Updated 10 months ago
- Nexusflow function call, tool use, and agent benchmarks. ☆29 Updated 9 months ago
- ☆49 Updated 7 months ago
- ☆40 Updated 9 months ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆65 Updated last year
- Python library to use Pleias-RAG models ☆62 Updated 4 months ago
- Data preparation code for CrystalCoder 7B LLM ☆45 Updated last year
- Open Implementations of LLM Analyses ☆106 Updated 11 months ago
- Repo hosting code and materials related to speeding up LLM inference using token merging. ☆36 Updated last month
- Pre-training code for CrystalCoder 7B LLM ☆55 Updated last year
- GPT-4 Level Conversational QA Trained In a Few Hours ☆64 Updated last year
- Codebase accompanying the Summary of a Haystack paper. ☆79 Updated 11 months ago
- Verifiers for LLM Reinforcement Learning ☆72 Updated 5 months ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆92 Updated 7 months ago
- ☆67 Updated last year
- ☆57 Updated 11 months ago
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆111 Updated 5 months ago
- ☆51 Updated last year
- entropix style sampling + GUI ☆27 Updated 10 months ago
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for… ☆27 Updated 9 months ago
- A massively multilingual modern encoder language model ☆80 Updated last week
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback. ☆38 Updated 5 months ago
- 🔔🧠 Easily experiment with popular language agents across diverse reasoning/decision-making benchmarks! ☆54 Updated 2 months ago
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models. ☆87 Updated 2 weeks ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆55 Updated 7 months ago
- Easy-to-use, high-performance knowledge distillation for LLMs ☆92 Updated 4 months ago
- Data preparation code for Amber 7B LLM ☆91 Updated last year
- ☆23 Updated 7 months ago
- ☆135 Updated 3 weeks ago