A Step-by-Step Implementation of Qwen 3 MoE Architecture from Scratch
☆77Aug 5, 2025Updated 7 months ago
Alternatives and similar repositories for qwen3-MoE-from-scratch
Users who are interested in qwen3-MoE-from-scratch are comparing it to the libraries listed below.
- Write your next novel faster and easier☆15Dec 7, 2025Updated 3 months ago
- code for paper "Discerning and Resolving Knowledge Conflicts through Adaptive Decoding with Contextual Information-Entropy Constraint"☆12Sep 29, 2024Updated last year
- Crawled Wikipedia Tables with Passages☆13Aug 19, 2021Updated 4 years ago
- LLM 101: Getting Started with Large Language Models Together (course website)☆14Feb 2, 2025Updated last year
- An OpenAI API compatible images server to generate or manipulate images.☆17Feb 2, 2025Updated last year
- Being-M0.5: A Real-Time Controllable Vision-Language-Motion Model (ICCV 2025)☆35Sep 4, 2025Updated 6 months ago
- Tree-Invent: A novel molecular generative model constrained with topological tree☆13Jul 26, 2023Updated 2 years ago
- Minimal TPU implementation with 8x8 systolic array and PyTorch integration☆56Jan 26, 2026Updated last month
- A Beginner's Guide to Monetizing Your Python AI Chatbot☆16Apr 22, 2025Updated 11 months ago
- ☆39Feb 25, 2026Updated last month
- [EMNLP 2022] Revisiting Grammatical Error Correction Evaluation and Beyond☆20Nov 25, 2022Updated 3 years ago
- From a+b to sparsemax(QK^T)V in Triton!☆28Jun 19, 2025Updated 9 months ago
- We enable LLMs with personalization capability☆11Nov 16, 2023Updated 2 years ago
- A PDDL Solver in C++.☆15Jan 5, 2024Updated 2 years ago
- ☆16Nov 23, 2023Updated 2 years ago
- A comprehensive hands-on project for learning GPU programming with CUDA and HIP, covering fundamental concepts through advanced optimizat…☆35Nov 20, 2025Updated 4 months ago
- The official code and model for ACL 2023 paper 'mCLIP: Multilingual CLIP via Cross-lingual Transfer'☆10Jan 23, 2024Updated 2 years ago
- DRFI For Region Dissection☆13Jan 11, 2019Updated 7 years ago
- Synthetic data generation for evaluating LLM symbolic and logic reasoning☆22Mar 6, 2026Updated 2 weeks ago
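- Chinese financial LLM evaluation benchmark: six categories, twenty-five tasks, graded evaluation; Chinese domestic models achieved grade A☆10May 6, 2024Updated last year
- Documentation notes☆15Mar 16, 2021Updated 5 years ago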
- ☆18Sep 19, 2023Updated 2 years ago
- The Energy Transformer block, in JAX☆64Dec 14, 2023Updated 2 years ago
- Write your code as tree-like expressions, then transform it☆21Jan 9, 2024Updated 2 years ago
- Official implementation of our paper "Towards Reasoning in Large Language Models via Multi-Agent Peer Review Collaboration".☆14Nov 18, 2024Updated last year
- Add custom text to image with Thumbor filter☆15Sep 17, 2019Updated 6 years ago
- ☆12Feb 6, 2023Updated 3 years ago
- Official repository for FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models☆34Sep 19, 2025Updated 6 months ago
- Variational autoencoder implementation in tensorflow following the classic paper by Kingma and Welling.☆13Jul 12, 2017Updated 8 years ago
- Detecting car parking slots in open car park spaces☆13Oct 21, 2019Updated 6 years ago
- Crawl data from articles of the New York Times website☆13Oct 23, 2019Updated 6 years ago
- ☆12Aug 31, 2015Updated 10 years ago
- Hands-on advanced RAG tutorials☆30Apr 10, 2025Updated 11 months ago
- Use the LoRA technique to improve training of Large Language Models☆13Jul 25, 2023Updated 2 years ago
- Official code repo for the paper "MemGUI-Bench: Benchmarking Memory of Mobile GUI Agents in Dynamic Environments"☆33Mar 9, 2026Updated 2 weeks ago
- ☆16Mar 6, 2020Updated 6 years ago
- Spanish short-text matching competition: ranked 8/1027 in the preliminary round and 5/1027 in the second round☆19Aug 1, 2018Updated 7 years ago
- A language model suite for numbering antigen receptor sequences.☆40Updated this week
- Reproduction of FlagAlpha/Llama2-Chinese☆25Oct 18, 2023Updated 2 years ago