Building LLaMA 4 MoE from Scratch
☆73Apr 15, 2025Updated 11 months ago
Alternatives and similar repositories for train-llama4
Users that are interested in train-llama4 are comparing it to the libraries listed below.
- Implementation of a GPT-4o-like multimodal model from scratch using Python☆78Apr 4, 2025Updated 11 months ago
- Train a 29M parameter GPT from Scratch☆35Mar 4, 2025Updated last year
- ☆27Jan 22, 2026Updated 2 months ago
- Rucio K8s tutorial☆11Sep 26, 2025Updated 5 months ago
- PyTorch implementation of GRPO.☆15Apr 21, 2025Updated 11 months ago
- ☆11Feb 27, 2024Updated 2 years ago
- Synthetic Data Generator for Machine Learning Pipelines☆33Sep 2, 2025Updated 6 months ago
- ☆13Dec 6, 2024Updated last year
- ☆15Apr 29, 2025Updated 10 months ago
- Geographical Graph Attention Networks: Spatial Deep Learning Models for Spatial Prediction and Exploratory Spatial Data Analysis☆17Jul 28, 2025Updated 7 months ago
- ☆14Jun 16, 2020Updated 5 years ago
- API for toxic text classification, utilized pre-trained Distilbert and trained on Kaggle datasets. It helps identify and handle toxic con…☆14Apr 30, 2024Updated last year
- Vietnamese Large Language Model (LLM) fine-tuned for the task of Question Answering within the medical and healthcare domain☆26Mar 1, 2024Updated 2 years ago
- Based on BrainTransformers, BrainGPTForCausalLM is a Large Language Model (LLM) implemented using Spiking Neural Networks (SNN). We are e…☆32Oct 22, 2024Updated last year
- A rust-version of NVIDIA BlueField DOCA kit.☆14Jun 11, 2023Updated 2 years ago
- The Multimodal Model for Vietnamese Visual Question Answering (ViVQA)☆21Jul 29, 2024Updated last year
- Classify documents using Python based on SVM and TF-IDF.☆15Nov 19, 2019Updated 6 years ago
- Persistent dense gemm for Hopper in `CuTeDSL`☆15Aug 9, 2025Updated 7 months ago
- Notes and commented code for RLHF (PPO)☆127Feb 27, 2024Updated 2 years ago
- Our solution to ML Talent Match hackathon☆10Mar 22, 2024Updated 2 years ago
- Benchmark tests supporting the TiledCUDA library.☆18Nov 19, 2024Updated last year
- AI-powered tool that transforms raw notes into polished, professional formats☆35Aug 16, 2025Updated 7 months ago
- Automation Chatbot☆21Jan 1, 2025Updated last year
- EDA toolchain for processing-in-memory architectures, including an architecture synthesizer, a compiler, and a simulator☆19Jun 12, 2025Updated 9 months ago
- Open-source examples and guides for building with Qwen. Browse a collection of snippets, advanced techniques and walkthroughs.☆38Nov 20, 2024Updated last year
- ☆17Apr 9, 2025Updated 11 months ago
- Created a simple neural network using C++17 standard and the Eigen library that supports both forward and backward propagation.☆10Jul 27, 2024Updated last year
- Code from Chris Valasek @nudehaberdasher and Charlie Miller @0xcharlie car hack: http://blog.ioactive.com/2013/08/car-hacking-content.ht…☆14Oct 1, 2020Updated 5 years ago
- A simple and efficient memory pool implemented in C++11.☆10Jun 2, 2022Updated 3 years ago
- MLOps for Image Caption Generator.☆25Nov 27, 2023Updated 2 years ago
- A parser combinator in Ruby, with a pretty DSL☆11Jun 25, 2017Updated 8 years ago
- PyTorch implementation of Chinese multi-turn dialogue (CDial) based on GPT + NEZHA☆12Oct 22, 2022Updated 3 years ago
- A Step-by-Step Implementation of Google Veo 3 Architecture from Scratch☆82Jun 16, 2025Updated 9 months ago
- Database kernel notes☆13Aug 18, 2022Updated 3 years ago
- Langchain_CrewAI_Gemini - A Gemini-powered multi-agent AI project.☆13Mar 24, 2024Updated 2 years ago
- LLaMA 3 is one of the most promising open-source models after Mistral; we will recreate its architecture in a simpler manner.☆204Aug 23, 2024Updated last year
- ☆27Jun 12, 2025Updated 9 months ago
- GEMV implementation with CUTLASS☆19Aug 21, 2025Updated 7 months ago
- The official implementation of the paper "Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models" (NeurIPS 2025 Pos…☆70Sep 29, 2025Updated 5 months ago