soketlabs / coomLinks

A training framework for large-scale language models based on Megatron-Core, the COOM Training Framework is designed to efficiently handle extensive model training inspired by Deepseek's HAI-LLM optimizations.

☆23

Alternatives and similar repositories for coom

Users that are interested in coom are comparing it to the libraries listed below

Sorting:

PrimeIntellect-ai / prime-environments
Training-Ready RL Environments + Evals
☆158Updated this week
MekkCyber / TritonAcademy
A repository to unravel the language of GPUs, making their kernel conversations easy to understand
☆194Updated 5 months ago
silvaxxx1 / MyLLM
"LLM from Zero to Hero: An End-to-End Large Language Model Journey from Data to Application!"
☆134Updated 3 weeks ago
hkproj / multi-latent-attention
☆45Updated 5 months ago
kmohan321 / Research_Papers
☆46Updated 7 months ago
aniket-mish / cuda
everything i know about cuda and triton
☆13Updated 9 months ago
adithya-s-k / indic_eval
A lightweight evaluation suite tailored specifically for assessing Indic LLMs across a diverse range of tasks
☆38Updated last year
meta-pytorch / OpenEnv
An interface library for RL post training with environments.
☆461Updated this week
divyamakkar0 / JAXformer
A zero-to-one guide on scaling modern transformers with n-dimensional parallelism.
☆104Updated last month
YuvrajSingh-mist / SmolLlama
So, I trained a Llama a 130M architecture I coded from ground up to build a small instruct model from scratch. Trained on FineWeb dataset…
☆15Updated 7 months ago
YuvrajSingh-mist / Paper-Replications
A repository consisting of paper/architecture replications of classic/SOTA AI/ML papers in pytorch
☆383Updated last month
pacman100 / openhathi_instruct
This repository contains the code for dataset curation and finetuning of instruct variant of the Bilingual OpenHathi model. The resultin…
☆23Updated last year
Cohere-Labs-Community / AI-Alignment-Cohort
☆29Updated last year
tokenbender / avataRL
rl from zero pretrain, can it be done? yes.
☆279Updated last month
kmohan321 / LLMs
☆89Updated 6 months ago
jxmorris12 / cde
code for training & evaluating Contextual Document Embedding models
☆199Updated 5 months ago
Laz4rz / GPT-2
Following master Karpathy with GPT-2 implementation and training, writing lots of comments cause I have memory of a goldfish
☆172Updated last year
KhoomeiK / sanskrit-ocr
☆44Updated 4 months ago
naklecha / llm-inference-optimizations-explained
in this repository, i'm going to implement increasingly complex llm inference optimizations
☆70Updated 5 months ago
huggingface / picotron_tutorial
☆224Updated 2 weeks ago
0xD4rky / Vision-Transformers
This repo has all the basic things you'll need in-order to understand complete vision transformer architecture and its various implementa…
☆228Updated 10 months ago
SwekeR-463 / Papers-Implemented
repo of paper implementations
☆20Updated 8 months ago
haizelabs / Awesome-LLM-Judges
⚖️ Awesome LLM Judges ⚖️
☆132Updated 6 months ago
1y33 / 100Days
GPU Kernels
☆203Updated 6 months ago
ThinamXx / Meta-llama
Complete implementation of Llama2 with/without KV cache & inference 🚀
☆48Updated last year
Pleias / Quest-Best-Tokens
An introduction to LLM Sampling
☆79Updated 10 months ago
JoeLi12345 / nGPT
an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)
☆107Updated 7 months ago
Maharshi-Pandya / cudacodes
Learnings and programs related to CUDA
☆422Updated 4 months ago
google-deepmind / latent-multi-hop-reasoning
[ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning?
☆79Updated 7 months ago
PrimeIntellect-ai / genesys
☆135Updated 7 months ago