soketlabs / coom
COOM is a Megatron-Core-based training framework for large-scale language models, designed to handle extensive model training efficiently and inspired by DeepSeek's HAI-LLM optimizations.
☆24 · Updated last week
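This listing doesn't show COOM's own entry points, so as a minimal sketch of the Megatron-Core foundation it builds on, the snippet below shows how tensor- and pipeline-parallel process groups are typically initialized. The `init_parallelism` function and its default sizes are illustrative assumptions, not COOM's actual API.

```python
# Minimal sketch of the Megatron-Core primitive a framework like COOM
# builds on: setting up tensor/pipeline model-parallel process groups.
# init_parallelism and its default sizes are illustrative, not COOM's API.
import torch
from megatron.core import parallel_state

def init_parallelism(tp_size: int = 2, pp_size: int = 2) -> None:
    # Megatron-Core requires torch.distributed to be initialized first
    # (typically via torchrun, which sets RANK/WORLD_SIZE/MASTER_ADDR).
    if not torch.distributed.is_initialized():
        torch.distributed.init_process_group(backend="nccl")
    # Carve the world into tensor- and pipeline-parallel groups;
    # data parallelism uses whatever ranks remain.
    parallel_state.initialize_model_parallel(
        tensor_model_parallel_size=tp_size,
        pipeline_model_parallel_size=pp_size,
    )
```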
Alternatives and similar repositories for coom
Users interested in coom are comparing it to the libraries listed below:
- A lightweight evaluation suite tailored specifically for assessing Indic LLMs across a diverse range of tasks ☆38 · Updated last year
- Training-Ready RL Environments + Evals ☆116 · Updated last week
- ☆28 · Updated last year
- A zero-to-one guide on scaling modern transformers with n-dimensional parallelism. ☆95 · Updated 2 weeks ago
- ☆44 · Updated 3 months ago
- A repository of replications of classic and SOTA AI/ML papers and architectures in PyTorch ☆376 · Updated last week
- A blueprint for creating Pretraining and Fine-Tuning datasets for Indic languages ☆116 · Updated last year
- So, I trained a Llama 130M architecture I coded from the ground up to build a small instruct model from scratch. Trained on the FineWeb dataset… ☆15 · Updated 6 months ago
- ☆89 · Updated 6 months ago
- ☆225 · Updated 3 months ago
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆194 · Updated 4 months ago
- This repository contains the code for dataset curation and finetuning of the instruct variant of the Bilingual OpenHathi model. The resultin… ☆23 · Updated last year
- Complete implementation of Llama2 with/without KV cache & inference 🚀 ☆48 · Updated last year
- Everything I know about CUDA and Triton ☆13 · Updated 8 months ago
- "LLM from Zero to Hero: An End-to-End Large Language Model Journey from Data to Application!"☆134Updated last week
- ☆46Updated 6 months ago
- ☆45Updated 4 months ago
- code for training & evaluating Contextual Document Embedding models☆197Updated 4 months ago
- Basically a repo containing architectures, algorithms, and papers implemented from scratch in PyTorch ☆30 · Updated 3 months ago
- rl from zero pretrain, can it be done? yes. ☆275 · Updated last week
- Starter pack for NeurIPS LLM Efficiency Challenge 2023. ☆126 · Updated 2 years ago
- ☆135 · Updated 6 months ago
- Following master Karpathy with a GPT-2 implementation and training, writing lots of comments because I have the memory of a goldfish ☆172 · Updated last year
- GPU Kernels ☆200 · Updated 5 months ago
- [ACL'25] Official Code for LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs ☆314 · Updated 2 months ago
- Low-latency, high-accuracy custom query routers for humans and agents. Built by Prithivi Da ☆116 · Updated 6 months ago
- Fine-tune an LLM to perform batch inference and online serving. ☆112 · Updated 4 months ago
- Small auto-grad engine inspired by Karpathy's micrograd and PyTorch ☆277 · Updated 10 months ago
- Simple Transformer in Jax ☆139 · Updated last year
- Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a 62% absolute increase in evaluation accuracy. ☆53 · Updated 5 months ago