Building LLaMA 4 MoE from Scratch
☆72Apr 15, 2025Updated 10 months ago
Alternatives and similar repositories for train-llama4
Users interested in train-llama4 are comparing it to the repositories listed below
- Implementation of a GPT-4o like Multimodal from Scratch using Python☆78Apr 4, 2025Updated 11 months ago
- Train a simple CLIP model hands-on to deepen your understanding of CLIP.☆22May 20, 2025Updated 9 months ago
- ☆32Nov 9, 2024Updated last year
- ☆46May 24, 2025Updated 9 months ago
- ☆15Mar 18, 2025Updated 11 months ago
- [EMNLP 2024 Findings] Wrong-of-Thought: An Integrated Reasoning Framework with Multi-Perspective Verification and Wrong Information☆13Oct 1, 2024Updated last year
- [ICLR 2024 Spotlight] Social Reward: Evaluating and Enhancing Generative AI through Million-User Feedback from an Online Creative Communi…☆11Mar 29, 2024Updated last year
- A distilled DeepSeek-R1 variant built on Qwen2.5-32B, fine-tuned with curated data for enhanced performance and efficiency. <metadata> gp…☆16Mar 11, 2025Updated 11 months ago
- Example scripts for using [my] fine-tuned CLIP models with HuggingFace 🤗☆13Sep 24, 2024Updated last year
- Build and test environment for CMSIS-Pack containing TensorFlow Lite Micro☆17Jul 9, 2025Updated 7 months ago
- ☆11Jun 2, 2024Updated last year
- Modest Maps actionscript3 port☆38Aug 27, 2015Updated 10 years ago
- A modern and functional chat assistant based on Local LLM that streamlines internal processes. Weather, internal information, support req…☆15Jan 9, 2026Updated last month
- Fake News Detective uses NLP to identify and debunk fake news, helping people to stay informed and make informed decisions. It is a power…☆16Dec 7, 2023Updated 2 years ago
- ☆10Nov 5, 2020Updated 5 years ago
- ☆14May 27, 2025Updated 9 months ago
- LLM-guided hyperparameter tuning☆10Oct 7, 2023Updated 2 years ago
- ☆20Aug 5, 2025Updated 6 months ago
- DeepSearch - Advanced Web Dir Scanner☆14Nov 13, 2018Updated 7 years ago
- Created a simple neural network using C++17 standard and the Eigen library that supports both forward and backward propagation.☆10Jul 27, 2024Updated last year
- A skeleton Mezzanine project demonstrating how to deploy it to each PaaS provider☆43Oct 8, 2021Updated 4 years ago
- Persistent dense gemm for Hopper in `CuTeDSL`☆15Aug 9, 2025Updated 6 months ago
- ☆11Feb 3, 2025Updated last year
- An example preloader for Starling Framework running in Adobe Flash Player in a web browser☆24Dec 10, 2014Updated 11 years ago
- Swift package that houses commonly used functions, extensions, views, classes, etc.☆12Oct 25, 2025Updated 4 months ago
- pubg_sdk☆11Jul 26, 2020Updated 5 years ago
- Rucio K8s tutorial☆11Sep 26, 2025Updated 5 months ago
- Comprehensive Turkish-language training material designed for learning MLOps (Machine Learning Operations) processes step by step☆14Oct 19, 2025Updated 4 months ago
- AI agent that controls a computer☆53Feb 23, 2025Updated last year
- NPL.load("npl_packages/main/");☆11Feb 28, 2023Updated 3 years ago
- Codes for paper : "A Stroke-based RNN for Writer-Independent Online Signature Verification"☆12May 6, 2019Updated 6 years ago
- Learning to draw samples: with application to amortized maximum likelihood estimator for generative adversarial learning☆10Dec 28, 2021Updated 4 years ago
- Finger gesture detection in 8 directions using Unity C#☆11Jan 18, 2021Updated 5 years ago
- Cobra website☆10Jul 13, 2024Updated last year
- Simple and efficient memory pool is implemented with C++11.☆10Jun 2, 2022Updated 3 years ago
- A minimal TPU compatible Jax implementation of NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis.☆13Apr 21, 2022Updated 3 years ago
- Gamedent - Tooth Floss Web Application // React.tailwindcss☆12Jan 6, 2024Updated 2 years ago
- ☆10Mar 16, 2025Updated 11 months ago
- Fitbliss · Fitness Tracker Application · Java, Swing, SQL☆10Jul 12, 2023Updated 2 years ago