GeeeekExplorer / cupytorch
A small framework that mimics PyTorch using CuPy or NumPy
☆41Updated 3 years ago
Alternatives and similar repositories for cupytorch
Users that are interested in cupytorch are comparing it to the libraries listed below
- Fast LLM training codebase with dynamic strategy selection [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler]☆40Updated last year
- Distributed DataLoader For Pytorch Based On Ray☆24Updated 3 years ago
- This repository contains the code for the paper in Findings of EMNLP 2021: "EfficientBERT: Progressively Searching Multilayer Perceptron …☆33Updated 2 years ago
- Large Scale Distributed Model Training strategy with Colossal AI and Lightning AI☆57Updated last year
- Contextual Position Encoding but with some custom CUDA Kernels https://arxiv.org/abs/2405.18719☆22Updated last year
- A *tuned* minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training☆115Updated 3 years ago
- A Tight-fisted Optimizer☆48Updated 2 years ago
- ☆22Updated last year
- patches for huggingface transformers to save memory☆26Updated last month
- Code for the paper "Query-Key Normalization for Transformers"☆43Updated 4 years ago
- Implementation of IceFormer: Accelerated Inference with Long-Sequence Transformers on CPUs (ICLR 2024).☆25Updated last year
- ICLR 2021 Stats & Graphs☆31Updated 3 years ago
- Analytical solutions for logistic regression and single-layer softmax☆12Updated 3 years ago
- ☆27Updated this week
- A tiny, didactical implementation of LLAMA 3☆41Updated 7 months ago
- An object detection codebase based on MegEngine.☆28Updated 2 years ago
- ☆53Updated this week
- The official implementation of You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Natu…☆48Updated 4 years ago
- Notes from my introduction to NLP at Fudan University☆37Updated 4 years ago
- ☆31Updated last year
- ☆14Updated last year
- The accompanying code for "Memory-efficient Transformers via Top-k Attention" (Ankit Gupta, Guy Dar, Shaya Goodman, David Ciprut, Jonatha…☆69Updated 3 years ago
- A custom PyTorch Dataset extension that provides faster iteration and better RAM usage☆44Updated last year
- Code for paper "Patch-Level Training for Large Language Models"☆85Updated 8 months ago
- This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2b model. The project includes both model and train…☆58Updated last year
- Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch☆30Updated 2 weeks ago
- ☆20Updated last year
- ☆14Updated 2 years ago
- Various test models in WNNX format. They can be viewed with `pip install wnetron && wnetron`☆12Updated 3 years ago
- Triton version of GQA flash attention, based on the tutorial☆11Updated 11 months ago