stanford-futuredata / Megatron-LM
Ongoing research training transformer models at scale
☆38 Updated last year
Alternatives and similar repositories for Megatron-LM
Users who are interested in Megatron-LM are comparing it to the libraries listed below.
- ☆118 Updated last year
- ☆45 Updated 2 years ago
- an implementation of Self-Extend, to expand the context window via grouped attention ☆119 Updated last year
- Just a bunch of benchmark logs for different LLMs ☆119 Updated last year
- An automated tool for discovering insights from research paper corpora ☆137 Updated last year
- ☆86 Updated last year
- A framework for orchestrating AI agents using a mermaid graph ☆77 Updated last year
- ☆68 Updated last year
- ☆47 Updated last year
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆59 Updated 2 months ago
- auto fine tune of models with synthetic data ☆77 Updated last year
- inference code for mixtral-8x7b-32kseqlen ☆104 Updated 2 years ago
- look how they massacred my boy ☆63 Updated last year
- ☆28 Updated last year
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI ☆222 Updated last year
- ☆20 Updated last year
- Scripts to create your own moe models using mlx ☆90 Updated last year
- A tree-based prefix cache library that allows rapid creation of looms: hierarchal branching pathways of LLM generations. ☆77 Updated 10 months ago
- Simple examples using Argilla tools to build AI ☆57 Updated last year
- Verbosity control for AI agents ☆64 Updated last year
- ☆88 Updated 2 years ago
- Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025) ☆91 Updated 11 months ago
- ☆19 Updated last year
- ☆122 Updated last year
- A seamless matchmaking application that is programmed with Cohere Command R+, Stanford NLP DSPy framework, Weaviate Vector store and Crew… ☆59 Updated last year
- Cerule - A Tiny Mighty Vision Model ☆68 Updated last month
- MLX port for xjdr's entropix sampler (mimics jax implementation) ☆62 Updated last year
- LangChain chat model abstractions for dynamic failover, load balancing, chaos engineering, and more! ☆84 Updated last year
- Efficient vector database for hundreds of millions of embeddings. ☆211 Updated last year
- Mixing Language Models with Self-Verification and Meta-Verification ☆111 Updated last year