soketlabs / coom
The COOM Training Framework is a Megatron-Core-based framework for efficiently training large-scale language models, inspired by DeepSeek's HAI-LLM optimizations.
☆24 · Updated last month
Alternatives and similar repositories for coom
Users interested in coom are comparing it to the repositories listed below.
- Curated collection of community environments ☆200 · Updated this week
- A lightweight evaluation suite tailored specifically for assessing Indic LLMs across a diverse range of tasks ☆38 · Updated last year
- ☆29 · Updated last year
- "LLM from Zero to Hero: An End-to-End Large Language Model Journey from Data to Application!" ☆141 · Updated last week
- ☆233 · Updated this week
- Simple & Scalable Pretraining for Neural Architecture Research ☆306 · Updated last month
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆195 · Updated 7 months ago
- A zero-to-one guide on scaling modern transformers with n-dimensional parallelism ☆112 · Updated last week
- In this repository, I'm going to implement increasingly complex LLM inference optimizations ☆79 · Updated 7 months ago
- Code for training & evaluating Contextual Document Embedding models ☆201 · Updated 7 months ago
- RL from zero pretrain, can it be done? Yes. ☆286 · Updated 3 months ago
- A repository of architecture replications of classic and SOTA AI/ML papers in PyTorch ☆400 · Updated last month
- ☆46 · Updated 9 months ago
- ⚖️ Awesome LLM Judges ⚖️ ☆148 · Updated 8 months ago
- ☆116 · Updated last week
- ☆68 · Updated 7 months ago
- ☆136 · Updated 9 months ago
- Entropy Based Sampling and Parallel CoT Decoding ☆17 · Updated last year
- Fine-tune an LLM to perform batch inference and online serving ☆115 · Updated 7 months ago
- So, I trained a Llama, a 130M architecture I coded from the ground up, to build a small instruct model from scratch. Trained on the FineWeb dataset… ☆16 · Updated 9 months ago
- ~950-line, minimal, extensible LLM inference engine built from scratch ☆241 · Updated this week
- ☆45 · Updated 6 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆85 · Updated 9 months ago
- A repo containing architectures, algorithms, and papers implemented from scratch in PyTorch ☆30 · Updated 2 months ago
- OpenCoconut implements a latent reasoning paradigm where thoughts are generated before decoding ☆174 · Updated 11 months ago
- Everything I know about CUDA and Triton ☆13 · Updated 11 months ago
- Library for text-to-text regression, applicable to any input string representation, allowing pretraining and fine-tuning over multiple r… ☆305 · Updated 3 weeks ago
- ☆89 · Updated 9 months ago
- Compiling useful links, papers, benchmarks, ideas, etc. ☆46 · Updated 9 months ago
- Arrakis is a library to conduct, track, and visualize mechanistic interpretability experiments ☆31 · Updated 8 months ago