☆27Jan 7, 2025Updated last year
Alternatives and similar repositories for ditorch
Users who are interested in ditorch are comparing it to the libraries listed below
- ☆13May 23, 2025Updated 9 months ago
- ☆74Oct 31, 2024Updated last year
- ☆76Nov 22, 2024Updated last year
- APRIL: Active Partial Rollouts in Reinforcement Learning to Tame Long-tail Generation. A system-level optimization for scalable LLM tra…☆51Oct 11, 2025Updated 4 months ago
- Nex Venus Communication Library☆72Nov 17, 2025Updated 3 months ago
- ☆131Nov 11, 2024Updated last year
- ☆34Feb 3, 2025Updated last year
- Prefix-Aware Attention for LLM Decoding☆27Jan 23, 2026Updated last month
- Asynchronous pipeline parallel optimization☆19Feb 2, 2026Updated 3 weeks ago
- Tutorial for Ray☆36Mar 31, 2024Updated last year
- A Really Scalable RL Framework to 10k+ CPUs☆38Feb 29, 2024Updated 2 years ago
- 🤖FFPA: Extend FlashAttention-2 with Split-D, ~O(1) SRAM complexity for large headdim, 1.8x~3x↑🎉 vs SDPA EA.☆251Feb 13, 2026Updated 2 weeks ago
- A layered, decoupled deep learning inference engine☆79Feb 17, 2025Updated last year
- tokviz is a Python library for visualizing tokenization patterns across different language models.☆12Apr 25, 2024Updated last year
- Symphony — A decentralized multi-agent framework that enables intelligent agents to collaborate seamlessly across heterogeneous edge devi…☆30Oct 30, 2025Updated 4 months ago
- A simple MIPS CPU for BUAA CO course (and now NSCSCC).☆10May 15, 2021Updated 4 years ago
- Protocol buffers and other common resources.☆13Jan 20, 2026Updated last month
- This project is based on the [LTX-Video](https://github.com/Lightricks/LTX-Video) algorithm of the diffusers and optimized and accelerate…☆12Dec 31, 2024Updated last year
- A distributed stream querying engine that provides sub-millisecond stateful query at millions of queries per-second over fast-evolving li…☆10Jul 18, 2018Updated 7 years ago
- Performance tests for multinode NGC.Ready certification☆15Jan 28, 2026Updated last month
- This is the code for an agentic RAG method with a dynamic workflow.☆13Jan 22, 2026Updated last month
- Zipkin client for asgi. Compatible with Starlette Framework and Jaeger tracing server☆10Apr 21, 2024Updated last year
- ☆123Updated this week
- LLM Inference via Triton (Flexible & Modular): Focused on Kernel Optimization using CUBIN binaries, Starting from gpt-oss Model☆64Oct 18, 2025Updated 4 months ago
- Efficient Long-context Language Model Training by Core Attention Disaggregation☆91Updated this week
- Guide to deploying deep-learning inference networks and deep vision primitives on SOPHON TPU.☆18Nov 14, 2025Updated 3 months ago
- A Python implementation based on the official Redis article on distributed locks☆10Jan 16, 2021Updated 5 years ago
- read source code of boltdb & re-implement it in c++☆12Jun 2, 2018Updated 7 years ago
- C# implementation of a Skype client that allows using Skype features in any .NET Standard-based application☆13Dec 8, 2022Updated 3 years ago
- [ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models☆11Dec 13, 2023Updated 2 years ago
- The Runtimex package helps expose the representation of Go runtime internals safely.☆13Feb 19, 2025Updated last year
- paper and code for New Directions in Cloud Programming, CIDR 2021☆11Feb 17, 2021Updated 5 years ago
- ACM Class 2017 Computer Architecture☆10Jan 11, 2018Updated 8 years ago
- [ICML 2025] Efficiently Serving Large Multimodal Models Using EPD Disaggregation☆22May 29, 2025Updated 9 months ago
- Model explanation provides the ability to interpret the effect of the predictors on the composition of an individual score.☆13Jan 21, 2021Updated 5 years ago
- Inference deployment of the llama3☆11Apr 21, 2024Updated last year
- ☆13Jan 7, 2025Updated last year
- A simple operating system☆10Sep 9, 2022Updated 3 years ago
- istio http load balance☆10Aug 12, 2019Updated 6 years ago