☆134Feb 17, 2025Updated last year
Alternatives and similar repositories for DeepSeek-MoE-ResourceMap
Users that are interested in DeepSeek-MoE-ResourceMap are comparing it to the libraries listed below.
- Fast LLM Training CodeBase With dynamic strategy choosing [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler];☆40Jan 4, 2024Updated 2 years ago
- This is the official repo for the paper "Accelerating Parallel Sampling of Diffusion Models" Tang et al. ICML 2024 https://openreview.net…☆16Jul 19, 2024Updated last year
- ☆52Feb 5, 2025Updated last year
- OCRVerse: Towards Holistic OCR in End-to-End Vision-Language Models☆29Feb 4, 2026Updated last month
- Research Materials for Video Reflection Separation☆10Jan 23, 2020Updated 6 years ago
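- This series aims to let readers understand and implement every component of a large language model from scratch, along with the training and fine-tuning code, using only basic PyTorch and no other off-the-shelf external libraries; readers need only Python, PyTorch, and the most basic deep learning background.☆382Aug 28, 2025Updated 6 months ago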
- ☆47Jun 10, 2025Updated 9 months ago
- Building DeepSeek R1 from Scratch☆749Mar 21, 2025Updated last year
- MLLM @ Game☆16May 12, 2025Updated 10 months ago
- ☆50Jun 7, 2025Updated 9 months ago
- Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data.☆274Jan 20, 2026Updated 2 months ago
- ☆191Mar 13, 2026Updated last week
- Our 2nd-gen LMM☆34May 22, 2024Updated last year
- MathNet: A Data-Centric Approach, Dataset and Benchmark Model to Advance Mathematical Expression Recognition☆10Mar 19, 2025Updated last year
- This project aims to generate syntactic handwritten mathematical expressions. The dataset is generated from the CROHME 2014 training set.☆14Feb 24, 2022Updated 4 years ago
- As the name suggests: a hand-built RAG☆133Feb 27, 2024Updated 2 years ago
- ☆15Mar 21, 2025Updated last year
- Deeptoai series RAG tutorial☆98Oct 29, 2025Updated 4 months ago
- Simulator for LLM inference on an abstract 3D AIMC-based accelerator☆27Sep 18, 2025Updated 6 months ago
- Streaming Video Diffusion: Online Video Editing with Diffusion Models☆18Jun 3, 2024Updated last year
- [Ongoing Project] Codebase for network quantization study.☆12May 20, 2020Updated 5 years ago
- Code for 'Contrastive Multi-Document Question Generation'☆11Oct 16, 2022Updated 3 years ago
- Solutions of Kaggle Competition☆15Jan 28, 2018Updated 8 years ago
- ☆14Jan 4, 2017Updated 9 years ago
- Stream live plots to a matplotlib figure☆81Apr 18, 2025Updated 11 months ago
- Text classification on imbalanced data, based on PyTorch☆12Dec 26, 2021Updated 4 years ago
- A simple Python tool to measure the performance of ONNX models.☆27Sep 15, 2024Updated last year
- Youtu-Parsing: Perception, Structuring and Recognition via High-Parallelism Decoding☆60Feb 10, 2026Updated last month
- ☆18Jan 1, 2023Updated 3 years ago
- Agently Stage - Efficient Convenient Asynchronous & Multithreaded Programming☆13Apr 2, 2025Updated 11 months ago
- 🔥Your Daily Dose of AI Research from Hugging Face 🔥 Stay updated with the latest AI breakthroughs! This bot automatically collects and…☆56Mar 18, 2026Updated last week
- ☆10Feb 13, 2023Updated 3 years ago
- https://www.coursera.org/specializations/cloudcomputing☆11Apr 14, 2020Updated 5 years ago
- Musculoskeletal Analysis extension for 3D Slicer. Currently has cortical, cancellous, and bone density analysis.☆12May 2, 2024Updated last year
- a curated list of the role of small models in the LLM era☆110Sep 23, 2024Updated last year
- This project aims to collect and collate various datasets for multimodal large model training, including but not limited to pre-training …☆69May 7, 2025Updated 10 months ago
- ggml implementation of the baichuan13b model (adapted from llama.cpp)☆55Jul 27, 2023Updated 2 years ago
- ☆47Aug 23, 2021Updated 4 years ago
- Fetch arxiv data to LLM-friendly text☆130Feb 18, 2026Updated last month