soketlabs / coom
The COOM Training Framework is a training framework for large-scale language models, built on Megatron-Core. It is designed to handle extensive model training efficiently and is inspired by DeepSeek's HAI-LLM optimizations.
☆24Updated 2 weeks ago
Alternatives and similar repositories for coom
Users who are interested in coom are comparing it to the libraries listed below.
- A repository consisting of paper/architecture replications of classic/SOTA AI/ML papers in pytorch☆390Updated 2 weeks ago
- A lightweight evaluation suite tailored specifically for assessing Indic LLMs across a diverse range of tasks☆38Updated last year
- Basically a repo containing architectures/algorithms/papers from scratch in pytorch☆30Updated last month
- Following Karpathy with GPT-2 implementation and training, writing lots of comments cause I have memory of a goldfish☆172Updated last year
- Training-Ready RL Environments + Evals☆177Updated last week
- ☆45Updated 6 months ago
- An interface library for RL post training with environments.☆753Updated this week
- "LLM from Zero to Hero: An End-to-End Large Language Model Journey from Data to Application!"☆141Updated last week
- ☆45Updated 5 months ago
- rl from zero pretrain, can it be done? yes.☆281Updated 2 months ago
- ☆29Updated last year
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand☆196Updated 5 months ago
- A zero-to-one guide on scaling modern transformers with n-dimensional parallelism.☆104Updated 2 months ago
- So, I trained a Llama a 130M architecture I coded from ground up to build a small instruct model from scratch. Trained on FineWeb dataset…☆16Updated 8 months ago
- small auto-grad engine inspired from Karpathy's micrograd and PyTorch☆277Updated last year
- in this repository, i'm going to implement increasingly complex llm inference optimizations☆70Updated 6 months ago
- This repo has all the basic things you'll need in-order to understand complete vision transformer architecture and its various implementa…☆228Updated 10 months ago
- Starter pack for NeurIPS LLM Efficiency Challenge 2023.☆128Updated 2 years ago
- ☆233Updated 5 months ago
- ☆530Updated 3 months ago
- ☆46Updated 7 months ago
- GPU Kernels☆209Updated 7 months ago
- ☆83Updated 2 months ago
- ☆92Updated last year
- Learnings and programs related to CUDA☆427Updated 5 months ago
- Repository of implementations of classic and sota rl algorithms from scratch in PyTorch☆211Updated 2 weeks ago
- Simple Transformer in Jax☆139Updated last year
- A blueprint for creating Pretraining and Fine-Tuning datasets for Indic languages☆252Updated last year
- flexible search engine for video data☆65Updated 3 weeks ago
- ☆136Updated 8 months ago