Memory footprint reduction for transformer models
☆11Jan 24, 2023Updated 3 years ago
Alternatives and similar repositories for Tempo
Users who are interested in Tempo are comparing it to the libraries listed below.
- GPU-accelerated AES encryption project☆11Feb 13, 2015Updated 11 years ago
- A deep learning intermediate representation for multi-platform compilation optimization☆10Oct 28, 2024Updated last year
- Implementation of algorithms for memory optimized deep neural network training☆10Jul 23, 2020Updated 5 years ago
- Python C++ Code Manager☆15Sep 29, 2024Updated last year
- Auto-differentiation library for C++☆12Jan 16, 2022Updated 4 years ago
- Scalable radix top-k selection on GPUs.☆23Jan 27, 2025Updated last year
- [ASP-DAC 2025] "NeuronQuant: Accurate and Efficient Post-Training Quantization for Spiking Neural Networks" Official Implementation☆19Mar 6, 2025Updated last year
- [ACL 2026 🔥] CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark☆34Apr 20, 2026Updated last month
- AutodiffEngine☆13Apr 1, 2019Updated 7 years ago
- Implementation of vDNN++; an improvement over vDNN☆18Dec 7, 2018Updated 7 years ago
- ☆16Jul 29, 2025Updated 9 months ago
- Thinking is hard - automate it☆18Aug 24, 2022Updated 3 years ago
- Lecture notes at SJTU☆40Feb 12, 2021Updated 5 years ago
- [EuroSys'25] Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization☆23Apr 13, 2026Updated last month
- The official implementation of HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization☆19Mar 7, 2025Updated last year
- TiledKernel is a code generation library based on macro kernels and memory hierarchy graph data structure.☆19May 12, 2024Updated 2 years ago
- [NeurIPS 2024 D&B Track] DACO: Towards Application-Driven and Comprehensive Data Analysis via Code Generation☆13Mar 5, 2025Updated last year
- The only known (by 2022) open-source, easy-to-understand basic algorithm implementations in TD-CEM. (Please star and fork this project if…☆15Mar 1, 2022Updated 4 years ago
- Code for COLING 2022 paper "FactMix: Using a Few Labeled In-domain Examples to Generalize to Cross-domain Named Entity Recognition"☆15Jan 15, 2023Updated 3 years ago
- ☆12May 12, 2023Updated 3 years ago
- SJTU CS473 Project: Implementation of Deep Closest Point in TensorFlow, and its comparison with other registration methods.☆12Jun 14, 2020Updated 5 years ago
- A simple tool for parsing the profile.json file of mxnet☆14Aug 1, 2018Updated 7 years ago
- This repository compiles a list of papers/resources related to the graph retrieval-augmented generation! Star⭐ the repo and follow me if …☆10Dec 7, 2024Updated last year
- Deploy and Scale LLM-based applications☆26Jun 15, 2023Updated 2 years ago
- Retrieval-augmented Image Captioning☆13Feb 16, 2023Updated 3 years ago
- ☆22Jul 11, 2023Updated 2 years ago
- ☆12Mar 4, 2022Updated 4 years ago
- Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Juli…☆10Aug 7, 2020Updated 5 years ago
- Code for Semi-crowdsourced Clustering with Deep Generative Models☆12Dec 9, 2022Updated 3 years ago
- ☆10Nov 26, 2024Updated last year
- ☆13Mar 7, 2022Updated 4 years ago
- Deep exponential family models in MXNet/Gluon. Layers o' latents 💤☆17Oct 16, 2017Updated 8 years ago
- Source code for the CPU-Free model - a fully autonomous execution model for multi-GPU applications that completely excludes the involveme…☆21Apr 25, 2024Updated 2 years ago
- R-LPIPS [ICML W 2023]☆17Nov 14, 2023Updated 2 years ago
- CAKE Library for constant-bandwidth matrix multiplication on CPUs☆14Apr 6, 2024Updated 2 years ago
- Pytorch implementation of "Very Deep Graph Neural Networks via Noise Regularisation"☆10Aug 22, 2021Updated 4 years ago
- ☆20Nov 7, 2019Updated 6 years ago
- Make an OpenWrt router use an iPhone's network connection over USB☆15May 5, 2019Updated 7 years ago
- Reading seminar in Harvard Cloud Networking and Systems Group☆16Aug 29, 2022Updated 3 years ago