A Step-by-Step Implementation of Qwen 3 MoE Architecture from Scratch
☆81Aug 5, 2025Updated 10 months ago
Alternatives and similar repositories for qwen3-MoE-from-scratch
Users that are interested in qwen3-MoE-from-scratch are comparing it to the libraries listed below.
- code for paper "Discerning and Resolving Knowledge Conflicts through Adaptive Decoding with Contextual Information-Entropy Constraint"☆11Sep 29, 2024Updated last year
- A curated collection of large language model interview questions☆12Aug 29, 2024Updated last year
- Crawled Wikipedia Tables with Passages☆14Aug 19, 2021Updated 4 years ago
- LLM 101: Getting Started with Large Language Models Together (course website)☆15Feb 2, 2025Updated last year
- An OpenAI API compatible images server to generate or manipulate images.☆18Feb 2, 2025Updated last year
- "rsync for cloud storage" - Google Drive, S3, Dropbox, Backblaze B2, One Drive, Swift, Hubic, Wasabi, Google Cloud Storage, Yandex Files☆11Oct 27, 2024Updated last year
- Being-M0.5: A Real-Time Controllable Vision-Language-Motion Model (ICCV 2025)☆37Sep 4, 2025Updated 9 months ago
- Utility programs to pipe data across a RDMA-capable network☆19Mar 14, 2026Updated 2 months ago
- 2018 JD AI Fashion Challenge, fashion style recognition champion (all source code)☆12Sep 13, 2018Updated 7 years ago
- Training framework for Large Behavioral Models☆28Sep 17, 2025Updated 8 months ago
- Minimal TPU implementation with 8x8 systolic array and PyTorch integration☆62Jan 26, 2026Updated 4 months ago
- ☆47Apr 17, 2026Updated last month
- A Beginner's Guide to Monetizing Your Python AI Chatbot☆16Apr 22, 2025Updated last year
- Load and run Llama from safetensors files in C☆15Oct 24, 2024Updated last year
- [EMNLP 2022] Revisiting Grammatical Error Correction Evaluation and Beyond☆20Nov 25, 2022Updated 3 years ago
- Implementation of 12 AI agents evaluation techniques☆43Jul 31, 2025Updated 10 months ago
- Home page for Microsoft Phi-Ground tech-report☆23Sep 8, 2025Updated 9 months ago
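- Natural language processing, CCF Big Data & Computing Intelligence Contest: intelligent discovery, grading, and classification of data content for data security governance☆11Nov 17, 2022Updated 3 years ago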
- IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse☆104Mar 14, 2026Updated 2 months ago
- ☆26Nov 26, 2024Updated last year
- We enable LLM with personalization capability☆11Nov 16, 2023Updated 2 years ago
- Source code for the paper "C-LLM: Learn to Check Chinese Spelling Errors Character by Character"☆30Nov 19, 2024Updated last year
- A comprehensive hands-on project for learning GPU programming with CUDA and HIP, covering fundamental concepts through advanced optimizat…☆36Nov 20, 2025Updated 6 months ago
- Synthetic data generation for evaluating LLM symbolic and logic reasoning☆22Mar 6, 2026Updated 3 months ago
- Chinese text error correction based on seq2edit (Gector).☆29Nov 15, 2022Updated 3 years ago
- ☆25Jul 24, 2024Updated last year
- Documentation notes☆15Mar 16, 2021Updated 5 years ago
- ☆26May 26, 2024Updated 2 years ago
- ☆13Jun 3, 2020Updated 6 years ago
- ☆18Sep 19, 2023Updated 2 years ago
- CCF 2020 Beike question-answer matching, ranked 24th on leaderboard B☆12Nov 27, 2022Updated 3 years ago
- Code for ICML 2025 paper | Joint Localization and Activation Editing for Low-Resource Fine-Tuning☆28Jun 18, 2025Updated 11 months ago
- Official code of our work, Representation Learning for Resource-Constrained Keyphrase Generation.☆11May 26, 2022Updated 4 years ago
- The GitHub following tool that does what we're all thinking but too polite to say.☆30Nov 26, 2025Updated 6 months ago
- Official implementation of our paper "Towards Reasoning in Large Language Models via Multi-Agent Peer Review Collaboration".☆14Nov 18, 2024Updated last year
- ☆17Apr 10, 2024Updated 2 years ago
- ☆12Feb 6, 2023Updated 3 years ago
- This is a detailed code demo on how to conduct Full-Param Supervised Fine-tuning (SFT) and DPO (Direct Preference Optimization)☆19Jan 9, 2025Updated last year
- Crawl data from articles of the New York Times website☆13Oct 23, 2019Updated 6 years ago