An Affordable LLM Pre-training Benchmark via Accurate Loss Prediction across Scales
☆16 · Jun 6, 2024 · Updated last year
Alternatives and similar repositories for nanoLM
Users that are interested in nanoLM are comparing it to the libraries listed below.
- ☆18 · Sep 5, 2024 · Updated last year
- Masked Structural Growth for 2x Faster Language Model Pre-training ☆25 · Apr 28, 2024 · Updated last year
- Latent Large Language Models ☆19 · Aug 24, 2024 · Updated last year
- Lightweight and minimal dom template and ajax helpers ☆19 · Dec 15, 2023 · Updated 2 years ago
- Interpretable Charge Predictions for Criminal Cases: Learning to Generate Court Views from Fact Descriptions ☆15 · May 7, 2018 · Updated 7 years ago
- JAX implementation of GPTQ quantization algorithm ☆10 · Jul 19, 2023 · Updated 2 years ago
- A family of efficient edge language models in 100M~1B sizes. ☆19 · Feb 14, 2025 · Updated last year
- An implementation of the hammer2 filesystem for Plan 9 ☆19 · Nov 25, 2018 · Updated 7 years ago
- I use various Data Science and machine learning techniques to analyze customer data using STP framework. I preprocessed the data, perform… ☆12 · Apr 26, 2020 · Updated 5 years ago
- Parallel Associative Scan for Language Models ☆18 · Jan 8, 2024 · Updated 2 years ago
- Causal Inference for Time Series Data (with CausalML Demo) ☆14 · Jun 11, 2023 · Updated 2 years ago
- Write your code as tree-like expressions, then transform it ☆21 · Jan 9, 2024 · Updated 2 years ago
- ☆12 · Dec 13, 2023 · Updated 2 years ago
- This is the public repository of AAAI 2024 paper "Is a Large Language Model a Good Annotator for Event Extraction" ☆10 · Feb 16, 2024 · Updated 2 years ago
- [NeurIPS'24 Spotlight] Observational Scaling Laws ☆60 · Oct 2, 2024 · Updated last year
- An LLM text adventure game ☆21 · Jun 30, 2025 · Updated 8 months ago
- ☆22 · Nov 11, 2024 · Updated last year
- A PyTorch wrapper of parallel exclusive scan in CUDA ☆12 · May 25, 2023 · Updated 2 years ago
- The PyTorch implementation for our EMNLP 2021 paper "Learning Neural Templates for Recommender Dialogue System" ☆30 · Apr 11, 2022 · Updated 3 years ago
- A collection of reusable, high-performance, well-documented, thorough-tested layers and models in Jax ☆23 · Jun 8, 2025 · Updated 9 months ago
- The non-user-of-rawdraw-facing side of rawdraw. ☆12 · Jan 12, 2021 · Updated 5 years ago
- Lossless normalization of uppercase characters ☆11 · Jul 3, 2023 · Updated 2 years ago
- ☆23 · Aug 7, 2023 · Updated 2 years ago
- ☆12 · Nov 6, 2023 · Updated 2 years ago
- Network Etiquette (Netiquette) -- Written with 2020 technology in mind ☆10 · Nov 19, 2021 · Updated 4 years ago
- source code of (quasi-)Givens Orthogonal Fine Tuning integrated to peft lib ☆17 · Mar 13, 2025 · Updated last year
- Direct Preference Optimization for RWKV, aiming for RWKV-5 and 6. ☆11 · Mar 1, 2024 · Updated 2 years ago
- Training a BERT model from scratch. ☆11 · Oct 15, 2023 · Updated 2 years ago
- This project combines logistic regression, gradient boosting, and LSTMs to predict next-month returns. ☆13 · Sep 25, 2019 · Updated 6 years ago
- Minimalist RSS/Atom aggregator 📰 ☆23 · Oct 11, 2023 · Updated 2 years ago
- ChatGPT solutions for the MLE interview ☆14 · Dec 9, 2022 · Updated 3 years ago
- ☆59 · May 7, 2025 · Updated 10 months ago
- this project is developing to crawl stock A finance and trade data from website, process finance and trade data to get factors, and then … ☆17 · Jan 12, 2023 · Updated 3 years ago
- JAX implementations of RWKV ☆19 · Sep 26, 2023 · Updated 2 years ago
- Go port to plan9/arm64 ☆18 · Mar 11, 2025 · Updated last year
- Hacks to run proprietary NVIDIA drivers on musl systems ☆17 · Oct 6, 2021 · Updated 4 years ago
- Exploring Causal Inferences in Finance with Graph Neural Networks ☆17 · Nov 10, 2023 · Updated 2 years ago
- Application for Math formula detection in image/pdf and then recognition ☆12 · Jan 14, 2025 · Updated last year
- 🧰 (Almost) unique identicons - Based on SHA-1 and inspired by folded paper. ☆25 · Dec 4, 2021 · Updated 4 years ago