pytorch-profiler
☆49Jun 1, 2023Updated 2 years ago
Alternatives and similar repositories for flops-profiler
Users that are interested in flops-profiler are comparing it to the libraries listed below.
- PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for…☆154Apr 22, 2026Updated last week
- (ACL-IJCNLP 2021) Convolutions and Self-Attention: Re-interpreting Relative Positions in Pre-trained Language Models.☆21Jul 13, 2022Updated 3 years ago
- Official implementation for the paper Lancet: Accelerating Mixture-of-Experts Training via Whole Graph Computation-Communication Overlapp…☆14Nov 17, 2025Updated 5 months ago
- ☆14Apr 16, 2026Updated 2 weeks ago
- Quantized Attention on GPU☆44Nov 22, 2024Updated last year
- Hi-Speed DNN Training with Espresso: Unleashing the Full Potential of Gradient Compression with Near-Optimal Usage Strategies (EuroSys '2…☆15Sep 21, 2023Updated 2 years ago
- Sequence-level 1F1B schedule for LLMs.☆37Aug 26, 2025Updated 8 months ago
- ☆18Apr 21, 2024Updated 2 years ago
- DeciMamba: Exploring the Length Extrapolation Potential of Mamba (ICLR 2025)☆32Apr 9, 2025Updated last year
- ☆45Jul 4, 2024Updated last year
- Simple arithmetic coding library in C☆15Oct 19, 2015Updated 10 years ago
- Microsoft Collective Communication Library☆66Nov 23, 2024Updated last year
- Deduplication over dis-aggregated memory for Serverless Computing☆14Mar 21, 2022Updated 4 years ago
- indesAR is an Android app created using Augmented Reality tech. The app focuses on using AR to reduce the gap between seller and buyer by g…☆11Nov 20, 2021Updated 4 years ago
- Technical snippets related to Kinect development and image processing.☆14May 7, 2015Updated 10 years ago
- ⛔️ DEPRECATED <Please refer to https://github.com/nmsl-nthu/PCCArena for the latest version>☆10Jun 9, 2022Updated 3 years ago
- Nsight Compute In Docker☆13Dec 21, 2023Updated 2 years ago
- Estimating neural network runtime characteristics☆12Mar 25, 2023Updated 3 years ago
- torch_quantizer is an out-of-the-box quantization tool for PyTorch models on the CUDA backend, specially optimized for Diffusion Models.☆25Mar 29, 2024Updated 2 years ago
- A library to analyze PyTorch traces.☆510Apr 22, 2026Updated last week
- A metric to evaluate geometry distortions in decoded point clouds☆13Aug 7, 2023Updated 2 years ago
- ☆15Apr 18, 2023Updated 3 years ago
- Official Pytorch Implementation of Length-Adaptive Transformer (ACL 2021)☆102Nov 2, 2020Updated 5 years ago
- 🌈 The Bangumi extension for VSCode. Its data comes from Bilibili. [Maintenance phase]☆12Oct 7, 2023Updated 2 years ago
- ROS driver for the Basler ToF ES camera☆12Nov 16, 2020Updated 5 years ago
- A comic-style image generation tool based on AnimeGAN2 + serverless + NAS storage (demo no longer available)☆12May 11, 2022Updated 3 years ago
- ☆33Jan 6, 2025Updated last year
- Torch Distributed Experimental☆117Aug 5, 2024Updated last year
- ☆17Jul 24, 2023Updated 2 years ago
- SmartTLS is the project introduced in the paper "A Case for SmartNIC-accelerated Private Communication" (APNET 20). It accelerates web se…☆17Feb 20, 2025Updated last year
- UT Campus Object Dataset (CODa): Models for 3D Object Detection☆17Feb 4, 2025Updated last year
- Odysseus: Playground of LLM Sequence Parallelism☆78Jun 17, 2024Updated last year
- zombie game☆11Apr 19, 2019Updated 7 years ago
- Send sensor data from ARKit or ARCore to Grasshopper via wifi☆16Feb 6, 2021Updated 5 years ago
- Official Repo for "SplitQuant / LLM-PQ: Resource-Efficient LLM Offline Serving on Heterogeneous GPUs via Phase-Aware Model Partition and …☆38Aug 29, 2025Updated 8 months ago
- Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry☆43Jan 15, 2024Updated 2 years ago
- C++17 implementation of einops for libtorch - clear and reliable tensor manipulations with einstein-like notation☆11Oct 16, 2023Updated 2 years ago
- An OpenCL implementation of Octree search☆21Feb 13, 2018Updated 8 years ago
- ☆58Updated this week