So, I trained a 130M-parameter Llama architecture that I coded from the ground up, to build a small instruct model from scratch. It was trained on the FineWeb dataset from HuggingFace, consisting of 15M texts (10BT snapshot), for a total of 3 full epochs.
☆18 · Mar 26, 2025 · Updated last year
Alternatives and similar repositories for SmolLlama
Users that are interested in SmolLlama are comparing it to the libraries listed below.
- Learning notes and code from the CS 329S: Machine Learning Systems Design series. ☆23 · Jun 23, 2025 · Updated 11 months ago
- This repository includes examples of using Microsoft Semantic Kernel with local LLMs via Ollama ☆10 · May 14, 2024 · Updated 2 years ago
- This is a simple example of how to serve a DeepSeek model with Azure ML. ☆10 · Feb 10, 2025 · Updated last year
- This GUI aims to simplify the process of converting GGUF files to llamafile format by providing an intuitive and convenient way for users… ☆14 · Jan 2, 2026 · Updated 5 months ago
- ☆16 · Jan 6, 2025 · Updated last year
- Agent CLI ☆13 · Updated this week
- The UnisonAI Multi-Agent Framework, built on a custom workflow which allows AI agents to talk together and provides a flexible and extensibl… ☆23 · Feb 24, 2026 · Updated 3 months ago
- ☆12 · Dec 20, 2024 · Updated last year
- A repository of 2025-2026 AI Safety and Alignment programs, camps, and workshops. ☆21 · May 18, 2025 · Updated last year
- A guide to structured generation using constrained decoding ☆18 · Jun 9, 2024 · Updated 2 years ago
- Implementation of CoLA: Compute-Efficient Pre-Training of LLMs via Low-Rank Activation ☆26 · Feb 18, 2025 · Updated last year
- Template repository of a machine-learning Python project powered by FastAPI and PyTorch ☆15 · Aug 26, 2021 · Updated 4 years ago
- ChatBot App built using LangChain and Lightning AI ☆16 · Mar 4, 2023 · Updated 3 years ago
- Intelligent design experiment tool: Artificial Intelligence for Graph Design ☆20 · Dec 12, 2022 · Updated 3 years ago
- JAX port of FLUX.1 models using flax.nnx ☆23 · Sep 28, 2024 · Updated last year
- A tool to build a graph from a codebase ☆14 · Feb 19, 2025 · Updated last year
- Helping you kickstart your AI journey! ☆13 · Aug 23, 2024 · Updated last year
- ☆23 · Jun 6, 2025 · Updated last year
- Samples of good AI generated CUDA kernels ☆105 · May 30, 2025 · Updated last year
- Jarvis made by Kaushik Shresth Reverse Engineered by Likhi ☆16 · Feb 16, 2025 · Updated last year
- ☆14 · Aug 29, 2023 · Updated 2 years ago
- A collection of awesome lists that are about a variety of different topics. ☆46 · May 5, 2026 · Updated last month
- A NodeJS application to upload, watch and stream live videos. ☆12 · Jan 24, 2023 · Updated 3 years ago
- ☆39 · Mar 25, 2026 · Updated 2 months ago
- Code and scripts for NAACL 2022 industry track paper "Fast and Light-weight Answer Text Retrieval in Dialogue Systems". Built on top of C… ☆13 · Sep 17, 2025 · Updated 9 months ago
- This is the official implementation of the voxel-based humanoid locomotion in "Gallant: Voxel Grid-based Humanoid Locomotion and Local-na… ☆77 · Apr 24, 2026 · Updated last month
- A tool to help you generate a Java call graph. ☆10 · Apr 14, 2021 · Updated 5 years ago
- Transformer-based autoregressive variational autoencoder ☆12 · Feb 10, 2020 · Updated 6 years ago
- ☆50 · Jan 28, 2025 · Updated last year
- The SQL-RL-GEN is an algorithm based on a Reinforcement Learning approach with a reward function generated by a LLM to guide the agent's … ☆25 · Sep 18, 2025 · Updated 9 months ago
- https://openreview.net/forum?id=OC1o4_OI6Jw ☆13 · May 27, 2022 · Updated 4 years ago
- Using Seq2Seq transformers for Text2SQL task on WikiSQL dataset. ☆12 · Jan 8, 2022 · Updated 4 years ago
- Optimizing diffusion for production-ready speeds ☆40 · Jan 10, 2026 · Updated 5 months ago
- ☆13 · Jun 6, 2022 · Updated 4 years ago
- Expand -> Retrieve -> Rerank - simple method with strong results on BRIGHT benchmark ☆22 · Aug 22, 2025 · Updated 9 months ago
- A T5 based sequence generation model for WikiSQL task. Achieving 90.3% on test data set using sequence generation. ☆17 · Nov 11, 2020 · Updated 5 years ago
- ☆102 · Feb 11, 2026 · Updated 4 months ago
- AutoLog: Anomaly Detection by Deep Autoencoding of System Logs ☆11 · Oct 28, 2021 · Updated 4 years ago
- ☆21 · Sep 6, 2021 · Updated 4 years ago