This project aims to replicate mainstream open-source model architectures with limited computational resources, implementing mini models with 100-200M parameters.
☆189May 21, 2026Updated last week
Alternatives and similar repositories for Mini-LLM
Users who are interested in Mini-LLM are comparing it to the libraries listed below.
- 🚀 Train a VLA yourself through the full pipeline. Train a 26M-parameter vision-language model (VLM) from scratch in just 1 hour! 🌏☆33Oct 16, 2025Updated 7 months ago
- Let coding agents use ncu skills to analyze CUDA programs automatically!☆98Updated this week
- Instead of scrolling on your phone after work, learn something. Series 1: the CUDA computing framework CUFX (Cuda Framework eXtended).☆16Dec 15, 2024Updated last year
- An LLM training framework built from the ground up, featuring a custom BumbleBee architecture and end-to-end support for multiple open-so…☆98Apr 26, 2026Updated last month
- code for paper "Discerning and Resolving Knowledge Conflicts through Adaptive Decoding with Contextual Information-Entropy Constraint"☆12Sep 29, 2024Updated last year
- Curated collection of AI inference engineering resources — LLM serving, GPU kernels, quantization, distributed inference, and production …☆130Feb 4, 2026Updated 3 months ago
- [CIKM 2025] Constraint Back-translation Improves Complex Instruction Following of Large Language Models☆18May 23, 2025Updated last year
- This is a repository to practice multi-thread programming in C++☆30Feb 21, 2024Updated 2 years ago
- ☆13Feb 17, 2025Updated last year
- Code for ACL22 short Paper "Hierarchical Curriculum Learning for AMR Parsing"☆13Jun 1, 2022Updated 3 years ago
- Triton Compiler related materials.☆44Mar 16, 2026Updated 2 months ago
- AC No Code is the lazy coder's best way to get code accepted (AC) on an online judge (OJ): Write nothing; submit nowhere.☆10May 18, 2020Updated 6 years ago
- Chinese Pharmacopoeia RAG project☆10Oct 26, 2024Updated last year
- ☆49Apr 15, 2024Updated 2 years ago
- ClusterKV: Manipulating LLM KV Cache in Semantic Space for Recallable Compression (DAC'25)☆28Feb 26, 2026Updated 3 months ago
- A software renderer based on LibGraphics☆19Apr 17, 2022Updated 4 years ago
- Code for 2020 AI plus wireless communication competition.☆16Nov 30, 2020Updated 5 years ago
- https://github.com/zyds/transformers-code☆19Jan 17, 2024Updated 2 years ago
- [CVPR 2026] Official release of "Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning"☆130Apr 7, 2026Updated last month
- KnowLA: Enhancing Parameter-efficient Finetuning with Knowledgeable Adaptation, NAACL 2024☆16Jul 29, 2024Updated last year
- Co-Reinforcement Learning for Unified Multimodal Understanding and Generation☆47Jul 22, 2025Updated 10 months ago
- The system of SUDA-HUAWEI submitted at CAMR2022.☆12Nov 22, 2022Updated 3 years ago
- The official github repo for the open online courses: "Dive into LLMs".☆10Mar 15, 2024Updated 2 years ago
- Demos of many Rosetta applications☆25Jun 10, 2025Updated 11 months ago
- alphafold FAPE loss☆10Sep 28, 2021Updated 4 years ago
- [CVPR 2026] An official implementation of "Think Visually, Reason Textually: Vision-Language Synergy in ARC"☆43Nov 26, 2025Updated 6 months ago
- ☆68Mar 4, 2026Updated 2 months ago
- GEMM☆10Aug 26, 2023Updated 2 years ago
- A multimodal large-scale model, which performs close to the closed-source Qwen-VL-PLUS on many datasets and significantly surpasses the p…☆14Feb 5, 2024Updated 2 years ago
- Personal Chinese translation of the Yahboom (亚博智能) Jetson Orin NX course materials and documentation☆16Nov 7, 2024Updated last year
- CCL2024 Chinese Essay Rhetoric Recognition and Understanding☆17Oct 1, 2024Updated last year
- R files containing the code used to predict rugby world cup matches☆11Sep 18, 2015Updated 10 years ago
- Towards Real-World Writing Assistance: A Chinese Character Checking Benchmark with Faked and Misspelled Characters☆16May 30, 2024Updated last year
- ☆20Aug 5, 2025Updated 9 months ago
- The official GitHub repository for AC-EVAL, an ancient Chinese evaluation suite for large language models (LLMs)☆16Nov 12, 2024Updated last year
- Code for paper "Open-Domain Hierarchical Event Schema Induction by Incremental Prompting and Verification"☆16Jul 4, 2023Updated 2 years ago
- ☆11May 16, 2026Updated last week
- Local inference for the Llama and Tongyi Qianwen (Qwen) large models, implemented with Apple Metal☆10Apr 26, 2024Updated 2 years ago
- Deep learning AI for generating new molecules that bond to COVID-19.☆12Sep 17, 2020Updated 5 years ago