A Learnable LSH Framework for Efficient NN Training
☆34Jul 22, 2021Updated 4 years ago
Alternatives and similar repositories for mongoose
Users that are interested in mongoose are comparing it to the libraries listed below.
- Locality sensitive hash functions for Tensorflow 2.0.☆12Feb 18, 2022Updated 4 years ago
- ☆15Jan 7, 2022Updated 4 years ago
- A Sparse-tensor Communication Framework for Distributed Deep Learning☆13Nov 1, 2021Updated 4 years ago
- A compressed adaptive optimizer for training large-scale deep learning models using PyTorch☆25Nov 26, 2019Updated 6 years ago
- ☆30Oct 8, 2025Updated 7 months ago
- ☆11Apr 3, 2023Updated 3 years ago
- Measuring the Signal to Noise Ratio in Language Model Evaluation☆29Aug 19, 2025Updated 9 months ago
- ☆21Mar 7, 2024Updated 2 years ago
- FPGA-based HyperLogLog Accelerator☆12Jul 13, 2020Updated 5 years ago
- ☆44Mar 29, 2023Updated 3 years ago
- Proximal Asynchronous SAGA☆13Nov 30, 2017Updated 8 years ago
- Implementation for ACProp (Momentum centering and asynchronous update for adaptive gradient methods, NeurIPS 2021)☆17Oct 11, 2021Updated 4 years ago
- A2C training of Relational Deep Reinforcement Learning Architecture☆13Jun 22, 2022Updated 3 years ago
- A Suite for Parallel Inference of Diffusion Transformers (DiTs) on multi-GPU Clusters☆58May 3, 2026Updated 3 weeks ago
- ArcherCodeR is an open-source initiative enhancing code reasoning in large language models through scalable, rule-governed reinforcement …☆44Aug 6, 2025Updated 9 months ago
- Code repository for "Spatiotemporal Traffic Matrix Synthesis", Paul Tune and Matthew Roughan, ACM SIGCOMM 2015, London, UK, August 2015.☆15Jan 13, 2016Updated 10 years ago
- ☆11Jun 29, 2021Updated 4 years ago
- Codebase for "SLIDE : In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems"☆1,104Apr 13, 2021Updated 5 years ago
- ☆16May 11, 2017Updated 9 years ago
- SmartNIC☆14Dec 13, 2018Updated 7 years ago
- TiledLower is a Dataflow Analysis and Codegen Framework written in Rust.☆13Nov 23, 2024Updated last year
- A source-to-source compiler for optimizing CUDA dynamic parallelism by aggregating launches☆15Jun 21, 2019Updated 6 years ago
- ☆19Sep 10, 2019Updated 6 years ago
- Sketched SGD☆28Jul 4, 2020Updated 5 years ago
- bigcomputing☆33Nov 3, 2020Updated 5 years ago
- PyTorch compilation tutorial covering TorchScript, torch.fx, and Slapo☆17Mar 13, 2023Updated 3 years ago
- Aioli: A unified optimization framework for language model data mixing☆32Jan 17, 2025Updated last year
- Manages vllm-nccl dependency☆18Jun 3, 2024Updated last year
- [ICML 2024 Oral] LSH-Based Efficient Point Transformer (HEPT)☆26Jan 24, 2025Updated last year
- ☆45Apr 30, 2018Updated 8 years ago
- A GPU-accelerated DNN inference serving system that supports instant kernel preemption and biased concurrent execution in GPU scheduling.☆43May 29, 2022Updated 3 years ago
- Repo for WWW 2022 paper: Progressively Optimized Bi-Granular Document Representation for Scalable Embedding Based Retrieval☆16Mar 1, 2022Updated 4 years ago
- ☆11Dec 8, 2022Updated 3 years ago
- Ok-Topk is a scheme for distributed training with sparse gradients. Ok-Topk integrates a novel sparse allreduce algorithm (less than 6k c…☆27Dec 10, 2022Updated 3 years ago
- Beyond KV Caching: Shared Attention for Efficient LLMs☆20Jul 19, 2024Updated last year
- Efficient Neural Interaction Functions Search for Collaborative Filtering☆18Feb 15, 2020Updated 6 years ago
- SUSTech 2023 Spring CS328 Distributed System☆18Oct 5, 2024Updated last year
- A Collection of Papers on Diffusion Large Language Models☆47May 12, 2026Updated last week
- Code for paper 'Minimizing FLOPs to Learn Efficient Sparse Representations' published at ICLR 2020☆20Feb 14, 2020Updated 6 years ago