idoheinemann / Assembly-Neural-Network
A multi-layer feed-forward neural network implementation in 32-bit x86 assembly
☆20 Updated 6 years ago
Alternatives and similar repositories for Assembly-Neural-Network
Users who are interested in Assembly-Neural-Network are comparing it to the libraries listed below
- Code implementation from my blog post: https://fkodom.substack.com/p/transformers-from-scratch-in-pytorch ☆97 Updated 2 years ago
- My own repository containing the code I wrote to practice CUDA programming. ☆65 Updated 2 years ago
- NNCG: A Neural Network Code Generator ☆36 Updated last year
- Implements an LLM similar to Meta's Llama 2 from the ground up in PyTorch, for educational purposes. ☆38 Updated 11 months ago
- An implementation of the transformer architecture onto an Nvidia CUDA kernel ☆202 Updated 2 years ago
- Neural network from scratch in CUDA/C++ ☆88 Updated 4 months ago
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ☆86 Updated 2 years ago
- Visualising Losses in Deep Neural Networks ☆16 Updated last year
- Evaluation Code repository for the paper "ModuLoRA: Finetuning 3-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers". (2023… ☆13 Updated 2 years ago
- A really tiny autograd engine ☆99 Updated 8 months ago
- PyTorch Implementation of Transformers Explained with Comments ☆16 Updated 5 years ago
- PyTorch implementation of Hinton's FF Algorithm with hard negatives sampling ☆15 Updated 3 years ago
- Deep learning library implemented from scratch in numpy. Mixtral, Mamba, LLaMA, GPT, ResNet, and other experiments. ☆53 Updated last year
- A codebase implementing a simple GPT-like model from scratch based on the Attention is All You Need paper. ☆71 Updated 2 years ago
- Implementation of the specific Transformer architecture from PaLM - Scaling Language Modeling with Pathways - in Jax (Equinox framework) ☆190 Updated 3 years ago
- Inference Llama 2 in one file of pure C++ ☆87 Updated 2 years ago
- ctypes wrappers for HIP, CUDA, and OpenCL ☆130 Updated last year
- Convolutional Neural Network implemented from scratch for MNIST and CIFAR-10 datasets. ☆66 Updated 3 years ago
- This code repository contains the code used for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog po… ☆92 Updated 2 years ago
- An implementation of the base GPT-3 Model architecture from the paper by OpenAI "Language Models are Few-Shot Learners" ☆20 Updated last year
- Hinton's Forward-Forward Algorithm for Deep Learning ☆10 Updated 2 years ago
- Step by step explanation/tutorial of llama2.c ☆225 Updated 2 years ago
- Template repo for Python projects, especially those focusing on machine learning and/or deep learning. ☆15 Updated 2 weeks ago
- A simplistic linear and multiprocessed approach to sentiment analysis using Gzip Normalized Compression Distances with k nearest neighbor… ☆144 Updated 2 years ago
- OSLO: Open Source for Large-scale Optimization ☆175 Updated 2 years ago