vvr-rao / Training-a-Mini-114M-Parameter-Llama-3-like-Model-from-Scratch
Trained a 114 million Parameter LLM from Scratch.
☆19Jul 21, 2024Updated last year
Alternatives and similar repositories for Training-a-Mini-114M-Parameter-Llama-3-like-Model-from-Scratch
Users that are interested in Training-a-Mini-114M-Parameter-Llama-3-like-Model-from-Scratch are comparing it to the libraries listed below
- Estimate geoadditive spatial or spatio-temporal econometric models☆12Jul 4, 2022Updated 3 years ago
- CMIP6 climate data extraction and treatment by multi-polygon shapefile using ESGF NetCDF and WORLDCLIM datasets☆11Sep 25, 2021Updated 4 years ago
- ☆12Dec 14, 2024Updated last year
- CATBench, the Intel Cache Allocation Technology benchmarking suite described in our tech report, "Simple Cache Partitioning for Networked…☆12Oct 6, 2017Updated 8 years ago
- A minimum demo for PyTorch distributed extension functionality for collectives.☆15Jul 29, 2024Updated last year
- ☆10Sep 22, 2020Updated 5 years ago
- Python client for Jikan.moe, MyAnimeList unofficial API with good intentions.☆14Dec 20, 2022Updated 3 years ago
- This repository contains the complete source code that we used to conduct experiments in the paper: Text Window Denoising Autoencoder: Bu…☆15Jun 12, 2013Updated 12 years ago
- Community Detection algorithms for LightGraphs☆14Dec 18, 2025Updated 2 months ago
- ☆15Apr 18, 2023Updated 2 years ago
- ☆12Nov 24, 2020Updated 5 years ago
- This is the repository that holds the artifacts of ASPLOS'25 -- M5: Mastering Page Migration and Memory Management for CXL-based Tiered …☆16Apr 1, 2025Updated 10 months ago
- CacheDirector - Sending Packets to the Right Slice by Exploiting Intel Last-Level Cache Addressing☆12Apr 29, 2019Updated 6 years ago
- Python library for interacting with Dask clusters in Saturn☆12Sep 4, 2025Updated 5 months ago
- Pure Julia implementation for reading/writing data in the Avro format☆17May 10, 2024Updated last year
- A replication of the paper "Adaptive Mixtures of Local Experts" applied to the CIFAR-10 image classification dataset.☆12Mar 19, 2021Updated 4 years ago
- Environment equipped with reinforcement learning algorithms to train agents to play tic-tac-toe.☆13Mar 4, 2023Updated 2 years ago
- Code for my Medium article: "How you can quickly deploy your ML models with FastAPI"☆12Mar 18, 2021Updated 4 years ago
- Not regularly updated clone of http://git.dpdk.org/dpdk-stable/ with the purpose to develop a new driver for corundum/mqnic (https://gith…☆15Aug 24, 2023Updated 2 years ago
- ☆15Jun 22, 2022Updated 3 years ago
- Manipulation testing using local polynomial density methods.☆13Jan 23, 2025Updated last year
- A straightforward explanation of how DeepSeek R1 works☆17Feb 7, 2025Updated last year
- Datasets for training and evaluating Ancient Greek sentence embedding models☆15Jul 12, 2024Updated last year
- TabDDPM is the state of the art synthetic data generation tool using diffusion models. Here I wrap the diffusion model in an easier plug …☆17Jun 26, 2025Updated 7 months ago
- ☆16Mar 23, 2023Updated 2 years ago
- Simplistic Implementation of Zipformer: A faster and better encoder for automatic speech recognition in PyTorch☆18Jun 3, 2024Updated last year
- A simple implementation of a GPT-style Transformer architecture and inference.☆15Jan 26, 2024Updated 2 years ago
- ☆18May 8, 2021Updated 4 years ago
- plget is a tool used to measure latency packets spent in network stack, NIC driver and on the wire, trace interpacket gap, based as on h/…☆16Nov 18, 2019Updated 6 years ago
- Portfolio optimization package in Python.☆16Feb 20, 2020Updated 5 years ago
- Metrics for evaluation of Machine Learning and Deep Learning Models☆16Apr 20, 2024Updated last year
- Convolutional Neural Network for Click-Through Rate prediction.☆15Sep 28, 2016Updated 9 years ago
- ☆49Sep 26, 2025Updated 4 months ago
- Light tool to read and write encrypted pandas dataframes.☆15Mar 19, 2024Updated last year
- ☆20Jul 5, 2024Updated last year
- Efficient Neural Interaction Functions Search for Collaborative Filtering☆18Feb 15, 2020Updated 6 years ago
- ☆20Feb 2, 2025Updated last year
- Large matrix multiplication in CUDA☆17Oct 20, 2023Updated 2 years ago
- Data structures for graph neural network☆18May 25, 2024Updated last year