☆134Feb 17, 2025Updated last year
Alternatives and similar repositories for DeepSeek-MoE-ResourceMap
Users who are interested in DeepSeek-MoE-ResourceMap are comparing it to the libraries listed below
- Fast LLM training codebase with dynamic strategy selection [DeepSpeed + Megatron + FlashAttention + CUDA fusion kernels + compiler]☆40Jan 4, 2024Updated 2 years ago
- OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models☆29Feb 4, 2026Updated last month
- ☆45Jun 10, 2025Updated 8 months ago
- MLLM @ Game☆16May 12, 2025Updated 9 months ago
- Agently Stage - Efficient Convenient Asynchronous & Multithreaded Programming☆13Apr 2, 2025Updated 11 months ago
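- The goal of this series is to let readers understand and implement, from scratch in basic PyTorch and without relying on any other off-the-shelf external libraries, every component of a large language model, along with the training and fine-tuning code. Readers therefore need only Python, PyTorch, and the most basic deep learning background.☆379Aug 28, 2025Updated 6 months ago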
- Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data.☆270Jan 20, 2026Updated last month
- ☆190Feb 5, 2026Updated last month
- [NeurIPS 2024] Search for Efficient LLMs☆16Jan 16, 2025Updated last year
- ☆23Jun 30, 2025Updated 8 months ago
- Multinomial Factorization Machines☆21Oct 17, 2016Updated 9 years ago
- ☆50Jun 7, 2025Updated 8 months ago
- Benchmarking Attention Mechanism in Vision Transformers.☆20Oct 10, 2022Updated 3 years ago
- ☆26Apr 14, 2025Updated 10 months ago
- As the name suggests: a hand-built RAG☆132Feb 27, 2024Updated 2 years ago
- Stream live plots to a matplotlib figure☆81Apr 18, 2025Updated 10 months ago
- Simulator for LLM inference on an abstract 3D AIMC-based accelerator☆25Sep 18, 2025Updated 5 months ago
- Youtu-Parsing: Perception, Structuring and Recognition via High-Parallelism Decoding☆57Feb 10, 2026Updated 3 weeks ago
- solo-learn: a library of self-supervised methods for visual representation learning powered by Pytorch Lightning☆23Jan 19, 2026Updated last month
- The official repo for "Unified Domain Adaptive Semantic Segmentation" (IEEE TPAMI 2025)☆33Aug 14, 2025Updated 6 months ago
- [ECCV 2024] This is the official implementation of "Stitched ViTs are Flexible Vision Backbones".☆29Jan 23, 2024Updated 2 years ago
- Building DeepSeek R1 from Scratch☆747Mar 21, 2025Updated 11 months ago
- ☆49Aug 14, 2025Updated 6 months ago
- A simple Python tool to measure the performance of ONNX models.☆27Sep 15, 2024Updated last year
- 😎 Awesome lists of papers and codes about open-vocabulary perception, including both 3D and 2D☆64Jul 27, 2025Updated 7 months ago
- 🔥Your Daily Dose of AI Research from Hugging Face 🔥 Stay updated with the latest AI breakthroughs! This bot automatically collects and…☆56Feb 26, 2026Updated last week
- Official code for the paper "Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark"☆29Jun 30, 2025Updated 8 months ago
- ggml implementation of the baichuan13b model (adapted from llama.cpp)☆55Jul 27, 2023Updated 2 years ago
- A repository aimed at pruning DeepSeek V3, R1 and R1-zero to a usable size☆83Sep 5, 2025Updated 6 months ago
- ICLR 2025☆31May 21, 2025Updated 9 months ago
- ☆341Oct 11, 2025Updated 4 months ago
- ☆40Jul 15, 2025Updated 7 months ago
- ☆28Dec 2, 2024Updated last year
- Musculoskeletal Analysis extension for 3D Slicer. Currently has cortical, cancellous, and bone density analysis.☆12May 2, 2024Updated last year
- Deep Learning experiments of UCAS☆18Jun 25, 2019Updated 6 years ago
- Intelligent traffic monitoring and question-answering system based on an improved YOLOv8 and fine-tuned DeepSeek☆19Apr 16, 2025Updated 10 months ago
- ☆25Jun 24, 2021Updated 4 years ago
- ☆27Dec 13, 2022Updated 3 years ago
- Supports BM25+vector☆29May 26, 2025Updated 9 months ago