This project aims to replicate mainstream open-source model architectures with limited computational resources, implementing mini models with 100-200M parameters.
☆148Feb 10, 2026Updated 3 weeks ago
Alternatives and similar repositories for Mini-LLM
Users who are interested in Mini-LLM are comparing it to the libraries listed below.
- Instead of scrolling on your phone after work in the evening, learn something. Series 1: the CUDA compute framework CUFX (Cuda Framework eXtended).☆16Dec 15, 2024Updated last year
- An MLIR-based compiler that takes GPU kernels and compiles them to real hardware instructions. Interactive web visualizer included.☆109Feb 24, 2026Updated last week
- ☆53Feb 24, 2026Updated last week
- All Resources from Stanford CS106B 2021☆24Jul 11, 2025Updated 7 months ago
- The official github repo for the open online courses: "Dive into LLMs".☆10Mar 15, 2024Updated last year
- Students Engagement Detection Using Hybrid EfficientNetB7 Together With TCN, LSTM, and Bi-LSTM (DAiSEE and VRESEE datasets)☆11May 11, 2025Updated 9 months ago
- Deep Generative Models course, 2025☆11Jun 5, 2025Updated 9 months ago
- ☆11Dec 22, 2024Updated last year
- Multi-heap-sort for many small arrays, quicksort with 3 pivots for one big array, CUDA acceleration, CUDA memory compression.☆13Sep 29, 2024Updated last year
- GEMM☆10Aug 26, 2023Updated 2 years ago
- The official implementation of dLLM-Factory☆26Jul 12, 2025Updated 7 months ago
- A std::execution style runtime context and High Performance RPC Transport for using OpenUCX. Including CUDA/ROCM/... devices with RDMA.☆29Feb 22, 2026Updated last week
- ☆16Apr 1, 2025Updated 11 months ago
- UCPR: User-Centric Path Reasoning towards Explainable Recommendation, SIGIR 2021☆12Jun 18, 2022Updated 3 years ago
- ☆10Jul 11, 2018Updated 7 years ago
- An implementation of the paper "Retentive Network: A Successor to Transformer for Large Language Models" https://arxiv.org/pdf/2307.08621.pdf☆11Jul 25, 2023Updated 2 years ago
- ☆10Dec 8, 2022Updated 3 years ago
- MoE-Visualizer is a tool designed to visualize the selection of experts in Mixture-of-Experts (MoE) models.☆16Apr 8, 2025Updated 10 months ago
- Useful CUDA code.☆43Mar 11, 2022Updated 3 years ago
- ☆13Feb 6, 2025Updated last year
- GEMV implementation with CUTLASS☆19Aug 21, 2025Updated 6 months ago
- CAD - Memory Efficient Convolutional Adapter for Segment Anything☆12Oct 4, 2024Updated last year
- ☆12Aug 31, 2023Updated 2 years ago
- ☆13May 12, 2025Updated 9 months ago
- A crawler-based trending list that collects statistics on and analyzes data from the Peking University Treehole (北大树洞)☆12Mar 23, 2022Updated 3 years ago
- ☆39Dec 26, 2025Updated 2 months ago
- Local inference for the Llama and Tongyi Qianwen large models, implemented with Apple Metal☆10Apr 26, 2024Updated last year
- Official repository of the paper "Explainable Deep Learning Methods in Medical Image Classification: A Survey", ACM Computing Surveys (CS…☆10Jan 9, 2024Updated 2 years ago
- Row-wise block scaling for fp8 quantization matrix multiplication. Solution to GPU mode AMD challenge.☆17Feb 9, 2026Updated 3 weeks ago
- ☆11Aug 19, 2024Updated last year
- [CVPR 2026] Official release of "Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning"☆112Feb 25, 2026Updated last week
- A PyTorch implementation of "Self-Supervised GNN that Jointly Learns to Augment" or "Jointly Learnable Data Augmentations for Self-Superv…☆13Dec 13, 2021Updated 4 years ago
- Official code repository for PromptMix: A Class Boundary Augmentation Method for Large Language Model Distillation, EMNLP 2023☆12Dec 13, 2023Updated 2 years ago
- The system of SUDA-HUAWEI submitted at CAMR2022.☆11Nov 22, 2022Updated 3 years ago
- ☆16May 21, 2024Updated last year
- Source code of the "Graph-Bert: Only Attention is Needed for Learning Graph Representations" paper☆15Jan 22, 2020Updated 6 years ago
- ☆14Nov 3, 2025Updated 4 months ago
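- "A Guide to Using Open-Source Large Models" (《开源大模型食用指南》): a tutorial for quickly deploying open-source large models in a Linux environment, tailored to users in China☆11Jun 8, 2024Updated last year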
- ☆20Oct 28, 2024Updated last year