soketlabs / coom
The COOM Training Framework is a Megatron-Core-based training framework for large-scale language models, designed to handle extensive model training efficiently and inspired by DeepSeek's HAI-LLM optimizations.
☆23 · Updated last month
Alternatives and similar repositories for coom
Users that are interested in coom are comparing it to the libraries listed below
- Training-Ready RL Environments + Evals ☆158 · Updated this week
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆194 · Updated 5 months ago
- "LLM from Zero to Hero: An End-to-End Large Language Model Journey from Data to Application!" ☆134 · Updated 3 weeks ago
- ☆45 · Updated 5 months ago
- ☆46 · Updated 7 months ago
- Everything I know about CUDA and Triton ☆13 · Updated 9 months ago
- A lightweight evaluation suite tailored specifically for assessing Indic LLMs across a diverse range of tasks ☆38 · Updated last year
- An interface library for RL post-training with environments. ☆461 · Updated this week
- A zero-to-one guide on scaling modern transformers with n-dimensional parallelism. ☆104 · Updated last month
- So, I trained a Llama: a 130M architecture I coded from the ground up to build a small instruct model from scratch. Trained on the FineWeb dataset… ☆15 · Updated 7 months ago
- A repository consisting of paper/architecture replications of classic/SOTA AI/ML papers in PyTorch ☆383 · Updated last month
- This repository contains the code for dataset curation and finetuning of the instruct variant of the Bilingual OpenHathi model. The resultin… ☆23 · Updated last year
- ☆29 · Updated last year
- RL from zero pretrain, can it be done? Yes. ☆279 · Updated last month
- ☆89 · Updated 6 months ago
- Code for training & evaluating Contextual Document Embedding models ☆199 · Updated 5 months ago
- Following master Karpathy with a GPT-2 implementation and training, writing lots of comments because I have the memory of a goldfish ☆172 · Updated last year
- ☆44 · Updated 4 months ago
- In this repository, I'm going to implement increasingly complex LLM inference optimizations ☆70 · Updated 5 months ago
- ☆224 · Updated 2 weeks ago
- This repo has all the basic things you'll need in order to understand the complete vision transformer architecture and its various implementa… ☆228 · Updated 10 months ago
- Repo of paper implementations ☆20 · Updated 8 months ago
- ⚖️ Awesome LLM Judges ⚖️ ☆132 · Updated 6 months ago
- GPU Kernels ☆203 · Updated 6 months ago
- Complete implementation of Llama2 with/without KV cache & inference 🚀 ☆48 · Updated last year
- An introduction to LLM Sampling ☆79 · Updated 10 months ago
- An open-source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆107 · Updated 7 months ago
- Learnings and programs related to CUDA ☆422 · Updated 4 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆79 · Updated 7 months ago
- ☆135 · Updated 7 months ago