This project aims to replicate mainstream open-source model architectures with limited computational resources, implementing mini models with 100-200M parameters.
☆151 · Feb 10, 2026 · Updated last month
Alternatives and similar repositories for Mini-LLM
Users that are interested in Mini-LLM are comparing it to the libraries listed below.
- ☆12 · Dec 22, 2024 · Updated last year
- An LLM training framework built from the ground up, featuring a custom BumbleBee architecture and end-to-end support for multiple open-so… ☆63 · Feb 9, 2026 · Updated last month
- Learn something after work instead of scrolling on your phone. Series 1: the CUDA compute framework CUFX (Cuda Framework eXtended). ☆16 · Dec 15, 2024 · Updated last year
- 📄 A Claude skill for comprehensive academic paper analysis: deep reports, mind maps, peer review, and promo scripts. ☆48 · Mar 9, 2026 · Updated 2 weeks ago
- Code for paper "Discerning and Resolving Knowledge Conflicts through Adaptive Decoding with Contextual Information-Entropy Constraint" ☆12 · Sep 29, 2024 · Updated last year
- Implementation of Direct Preference Optimization ☆17 · Jul 17, 2023 · Updated 2 years ago
- AlphaGo Zero implemented from scratch ☆17 · Nov 10, 2024 · Updated last year
- ☆13 · Feb 17, 2025 · Updated last year
- ☆53 · Feb 24, 2026 · Updated last month
- Spring 2018 Engineering Practice and Technology Innovation IV-E: intelligent robot car ☆10 · May 10, 2018 · Updated 7 years ago
- Code for ICLR 2025 Paper "GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment" ☆20 · Feb 10, 2025 · Updated last year
- ☆10 · Jul 11, 2018 · Updated 7 years ago
- ☆19 · Oct 28, 2025 · Updated 4 months ago
- Co-Reinforcement Learning for Unified Multimodal Understanding and Generation ☆41 · Jul 22, 2025 · Updated 8 months ago
- ☆22 · Feb 4, 2026 · Updated last month
- Building Llama 3 from scratch using PyTorch ☆13 · Sep 1, 2024 · Updated last year
- ClusterKV: Manipulating LLM KV Cache in Semantic Space for Recallable Compression (DAC'25) ☆27 · Feb 26, 2026 · Updated last month
- A software renderer based on LibGraphics ☆19 · Apr 17, 2022 · Updated 3 years ago
- [CVPR 2026] Official release of "Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning" ☆121 · Feb 25, 2026 · Updated last month
- [ICLR 2026] An official implementation of "STAR-Bench: Probing Deep Spatio-Temporal Reasoning as Audio 4D Intelligence" ☆40 · Jan 17, 2026 · Updated 2 months ago
- KnowLA: Enhancing Parameter-efficient Finetuning with Knowledgeable Adaptation, NAACL 2024 ☆16 · Jul 29, 2024 · Updated last year
- Chinese Characters Visualization & Chinese Text Augmentation. ☆17 · Sep 19, 2022 · Updated 3 years ago
- The official GitHub repo for the open online courses: "Dive into LLMs". ☆10 · Mar 15, 2024 · Updated 2 years ago
- ☆12 · Jan 19, 2026 · Updated 2 months ago
- GEMM ☆10 · Aug 26, 2023 · Updated 2 years ago
- [EMNLP 2024 Tutorial] Language Agents: Foundations, Prospects, and Risks ☆10 · Nov 27, 2024 · Updated last year
- A multimodal large-scale model, which performs close to the closed-source Qwen-VL-PLUS on many datasets and significantly surpasses the p… ☆14 · Feb 5, 2024 · Updated 2 years ago
- ☆17 · Apr 11, 2021 · Updated 4 years ago
- R files containing the code used to predict rugby world cup matches ☆10 · Sep 18, 2015 · Updated 10 years ago
- Towards Real-World Writing Assistance: A Chinese Character Checking Benchmark with Faked and Misspelled Characters ☆16 · May 30, 2024 · Updated last year
- ☆13 · May 12, 2025 · Updated 10 months ago
- Runner-up solution for the 2021 iFLYTEK Exam Question Tag Prediction Challenge ☆12 · Dec 4, 2021 · Updated 4 years ago
- The official GitHub repository for AC-EVAL, an ancient Chinese evaluation suite for large language models (LLMs) ☆16 · Nov 12, 2024 · Updated last year
- ☆11 · Sep 21, 2022 · Updated 3 years ago
- Code for paper "Open-Domain Hierarchical Event Schema Induction by Incremental Prompting and Verification" ☆16 · Jul 4, 2023 · Updated 2 years ago
- Android app that bulk-imports Excel 2003 spreadsheets to send group SMS messages; the first column of the spreadsheet must contain the phone numbers. ☆14 · Apr 13, 2017 · Updated 8 years ago
- Use the clash proxy easily from the Linux command line; all you need is a subscription link. ☆51 · Jul 28, 2025 · Updated 7 months ago
- A std::execution style runtime context and High Performance RPC Transport for using OpenUCX. Including CUDA/ROCM/... devices with RDMA. ☆30 · Feb 22, 2026 · Updated last month
- Yixuan Wang's personal blog. ☆13 · Feb 19, 2026 · Updated last month