Toseic / LLM-inference-arxiv-daily
🎓 Automatically update LLM inference systems papers daily using GitHub Actions (updates every 12 hours)
☆12 · Updated this week
Alternatives and similar repositories for LLM-inference-arxiv-daily
Users that are interested in LLM-inference-arxiv-daily are comparing it to the libraries listed below
- Asynchronous pipeline parallel optimization · ☆19 · Feb 2, 2026 · Updated 2 weeks ago
- ☆13 · May 16, 2019 · Updated 6 years ago
- ☆14 · Jun 10, 2025 · Updated 8 months ago
- Work in progress to create an RFC that documents the OpenVPN protocol · ☆14 · Nov 24, 2025 · Updated 2 months ago
- Awesome latest models, datasets and benchmarks on streaming/online video understanding. · ☆24 · Oct 19, 2025 · Updated 3 months ago
- Official codebase for our paper "Joslim: Joint Widths and Weights Optimization for Slimmable Neural Networks" · ☆12 · Jun 30, 2021 · Updated 4 years ago
- ☆10 · Nov 18, 2024 · Updated last year
- High-performance RMSNorm implementation using SM core storage (registers and shared memory) · ☆26 · Jan 22, 2026 · Updated 3 weeks ago
- Cute layout visualization · ☆30 · Jan 18, 2026 · Updated 3 weeks ago
- Expert Specialization MoE Solution based on CUTLASS · ☆27 · Jan 19, 2026 · Updated 3 weeks ago
- Website for CSE 234, Winter 2025 · ☆13 · Mar 24, 2025 · Updated 10 months ago
- Trusted Mamba Contrastive Network for Multi-View Clustering · ☆16 · Dec 10, 2025 · Updated 2 months ago
- ☆12 · Jul 7, 2021 · Updated 4 years ago
- ☆11 · Feb 19, 2021 · Updated 4 years ago
- Implementation of "DIME-FM: DIstilling Multimodal and Efficient Foundation Models" · ☆15 · Oct 12, 2023 · Updated 2 years ago
- This is the code for the paper published in IEEE Cloud Computing 2022 · ☆12 · Jul 22, 2022 · Updated 3 years ago
- A queue implemented using shared memory. · ☆13 · Aug 15, 2022 · Updated 3 years ago
- Code for the SIGMOD 2023 paper "SSIN: Self-Supervised Learning for Rainfall Spatial Interpolation". · ☆14 · Feb 3, 2024 · Updated 2 years ago
- A professional list of Papers on AI for Spatial Interpolation in AI conferences and journals. · ☆12 · Jul 29, 2024 · Updated last year
- This repository contains a list of papers on spatio-temporal graph, especially about GNNs on S-T graph. · ☆17 · Sep 8, 2023 · Updated 2 years ago
- Reading notes on Speculative Decoding papers · ☆21 · Dec 8, 2025 · Updated 2 months ago
- Typst template for Xidian University graduation theses · ☆13 · May 4, 2025 · Updated 9 months ago
- ☆12 · Aug 20, 2025 · Updated 5 months ago
- ☆13 · Jan 14, 2020 · Updated 6 years ago
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling · ☆22 · Feb 9, 2026 · Updated last week
- DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting · ☆17 · Mar 4, 2025 · Updated 11 months ago
- This project includes a simulator and workload generator for Edge-to-Cloud environments. Users can implement different scenarios, includi… · ☆15 · Aug 7, 2024 · Updated last year
- Wave: Python Domain-Specific Language for High Performance Machine Learning · ☆44 · Updated this week
- Summary of the Specs of Commonly Used GPUs for Training and Inference of LLM · ☆75 · Aug 12, 2025 · Updated 6 months ago
- ☆14 · Oct 4, 2021 · Updated 4 years ago
- Triton Implementation of Flash Attention with Bias. · ☆20 · Apr 16, 2025 · Updated 10 months ago
- ☆18 · Oct 15, 2024 · Updated last year
- An Automatic Synthesis Tool for PIM-based CNN Accelerators. · ☆16 · Feb 29, 2024 · Updated last year
- Awesome-Parallel-Reasoning: Unlocking the reasoning potential of LLMs. Papers, Code, Resources & Survey. · ☆48 · Jan 6, 2026 · Updated last month
- YOLOv3-RepVGG-backbone · ☆15 · Apr 25, 2021 · Updated 4 years ago
- ☆14 · Jan 28, 2026 · Updated 2 weeks ago
- AI model training on heterogeneous, geo-distributed resources · ☆35 · Nov 24, 2025 · Updated 2 months ago
- ☆31 · Updated this week
- Vortex: A Flexible and Efficient Sparse Attention Framework · ☆46 · Jan 21, 2026 · Updated 3 weeks ago