EleutherAI / oslo
OSLO: Open Source for Large-scale Optimization
☆175 · Updated last year
Alternatives and similar repositories for oslo:
Users interested in oslo are comparing it to the libraries listed below.
- OSLO: Open Source framework for Large-scale model Optimization ☆308 · Updated 2 years ago
- Data processing system for polyglot ☆91 · Updated last year
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ☆81 · Updated last year
- Evolve LLM training instructions, from English instructions to any language. ☆115 · Updated last year
- Inference code for LLaMA models in JAX ☆116 · Updated 10 months ago
- some common Huggingface transformers in maximal update parametrization (µP) ☆80 · Updated 3 years ago
- Exploring finetuning public checkpoints on filter 8K sequences on Pile ☆115 · Updated 2 years ago
- Experiments with generating opensource language model assistants ☆97 · Updated last year
- ☆67 · Updated 2 years ago
- JAX implementation of the Llama 2 model ☆216 · Updated last year
- ☆14 · Updated last month
- A minimal PyTorch Lightning OpenAI GPT w DeepSpeed Training! ☆111 · Updated last year
- Large-scale language modeling tutorials with PyTorch ☆290 · Updated 3 years ago
- ☆60 · Updated 3 years ago
- Train very large language models in Jax. ☆203 · Updated last year
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset. ☆93 · Updated 2 years ago
- Easy Language Model Pretraining leveraging Huggingface's Transformers and Datasets ☆129 · Updated 2 years ago
- Multipack distributed sampler for fast padding-free training of LLMs ☆186 · Updated 7 months ago
- [AAAI 2024] Investigating the Effectiveness of Task-Agnostic Prefix Prompt for Instruction Following ☆79 · Updated 6 months ago
- Sakura-SOLAR-DPO: Merge, SFT, and DPO ☆116 · Updated last year
- data related codebase for polyglot project ☆19 · Updated 2 years ago
- Language models scale reliably with over-training and on downstream tasks ☆96 · Updated last year
- A performance library for machine learning applications. ☆183 · Updated last year
- Pytorch/XLA SPMD Test code in Google TPU ☆23 · Updated last year
- Official code for "Distributed Deep Learning in Open Collaborations" (NeurIPS 2021) ☆116 · Updated 3 years ago
- ☆66 · Updated 2 years ago
- ☆98 · Updated 10 months ago
- [Google Meet] MLLM Arxiv Casual Talk ☆52 · Updated 2 years ago
- Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways - in Jax (Equinox framework) ☆187 · Updated 2 years ago
- Implementation of the conditionally routed attention in the CoLT5 architecture, in Pytorch ☆226 · Updated 6 months ago