A tool for generating a Qwen2 MoE model from Qwen2 (Qwen1.5) models
☆15Mar 29, 2024Updated 2 years ago
Alternatives and similar repositories for qwen2_moe_mergekit
Users that are interested in qwen2_moe_mergekit are comparing it to the libraries listed below.
- ☆12Jul 8, 2022Updated 3 years ago
- qwen2 and llama3 cpp implementation☆50Jun 7, 2024Updated last year
- ☆29Aug 14, 2023Updated 2 years ago
- Text classification competition top 10 strategy☆18Aug 14, 2021Updated 4 years ago
- wePoker is a multi-player poker game for Android☆11Mar 20, 2013Updated 13 years ago
- ☆14Oct 11, 2023Updated 2 years ago
- Official implementation of (ICML 2026) Training-Free Vector Quantization via Gaussian VAEs☆23Jan 3, 2026Updated 4 months ago
- Neural image compression models optimized for Mask R-CNN from paper "Boosting Neural Image Compression for Machines Using Latent Space Ma…☆10Aug 16, 2022Updated 3 years ago
- [IROS 2016] Implements an adaptive gating sensor fusion approach for object detection based on a mixture of convolutional neural network…☆10Mar 16, 2020Updated 6 years ago
- Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic☆32Feb 18, 2026Updated 3 months ago
- Code for the paper "Unbiased Supervised Contrastive Learning" | ICLR 2023 https://openreview.net/forum?id=Ph5cJSfD2XN☆12Sep 22, 2023Updated 2 years ago
- [Findings of EMNLP 2024] AdaMoE: Token-Adaptive Routing with Null Experts for Mixture-of-Experts Language Models☆20Oct 2, 2024Updated last year
- meta-comprehensive-rag-benchmark-kdd-cup-2024 phase1 task1 rank3☆21Jun 21, 2024Updated last year
- Official repository of "LiNeS: Post-training Layer Scaling Prevents Forgetting and Enhances Model Merging"☆33Nov 4, 2024Updated last year
- ☆10Jul 23, 2021Updated 4 years ago
- ☆17Apr 17, 2024Updated 2 years ago
- Official implementation for DenseMixer: Improving MoE Post-Training with Precise Router Gradient☆67Aug 3, 2025Updated 9 months ago
- [NeurIPS 2024] Image Understanding Makes for A Good Tokenizer for Image Generation☆22Dec 17, 2024Updated last year
- GFPGAN face reconstruction with ncnn on a bare Raspberry Pi☆14Jan 4, 2023Updated 3 years ago
- Robust Domain Adaptation under Noisy Environments☆18Jul 22, 2022Updated 3 years ago
- Daily paper reading records☆15Mar 31, 2025Updated last year
- [CVPR'26] AdapTok: Learning Adaptive and Temporally Causal Video Tokenization in a 1D Latent Space☆27Mar 15, 2026Updated 2 months ago
- The official repository of "Whoever Started the Interference Should End It: Guiding Data-Free Model Merging via Task Vectors"☆49Oct 1, 2025Updated 7 months ago
- [TOIS 2023] On the User Behavior Leakage from Recommender System Exposure☆19Nov 7, 2023Updated 2 years ago
- [SIGKDD 2023] HardSATGEN: Understanding the Difficulty of Hard SAT Formula Generation and A Strong Structure-Hardness-Aware Baseline☆22Jun 16, 2023Updated 2 years ago
- Basel morphable face model mesh and texture generator using GPU.☆14Sep 14, 2020Updated 5 years ago
- ☆15Mar 5, 2024Updated 2 years ago
- rewrite python scipy.signal.lfilter in c code☆11Aug 13, 2019Updated 6 years ago
- A MaskGIT port from JAX to PyTorch☆18Jun 18, 2022Updated 3 years ago
- ☆32Jul 8, 2024Updated last year
- Official implementation of "Controlling Rate, Distortion, and Realism: Towards a Single Comprehensive Neural Image Compression Model" (WA…☆23Jun 19, 2024Updated last year
- ☆56Nov 6, 2024Updated last year
- Knowledge graphs: from beginner to mastery☆33Nov 27, 2020Updated 5 years ago
- Janus NDI Plugin☆14Nov 2, 2025Updated 6 months ago
- Run pytorch models on GPU Android with Vulkan backend☆10Aug 15, 2023Updated 2 years ago
- Microservice development scaffold☆16Jun 25, 2024Updated last year
- ☆27May 27, 2024Updated 2 years ago
- A High performance and tiny TVM graph executor library written in C which can compile to WebAssembly and use CUDA/WebGPU as the accelerat…☆12Aug 3, 2023Updated 2 years ago
- This repository shows a demo of real-time digital makeup for a face. It can transfer the hair style, foundation make-up, eyelash, lip…☆13Jul 15, 2018Updated 7 years ago