HaoKang-Timmy/LatencySensitiveBench

First Latency-Aware Competitive LLM Agent Benchmark

☆32

Alternatives and similar repositories for LatencySensitiveBench

Users who are interested in LatencySensitiveBench are comparing it to the repositories listed below.

collaborative-agents / coco
Coco is a proactive co-assistant that connects user workspace with a broader ecosystem of AI agents.
☆23 · Updated this week
metacarbon / shareAtt
Beyond KV Caching: Shared Attention for Efficient LLMs
☆20 · Jul 19, 2024 · Updated 2 years ago
chhzh123 / ptc-tutorial
PyTorch compilation tutorial covering TorchScript, torch.fx, and Slapo
☆17 · Mar 13, 2023 · Updated 3 years ago
Open-Galapagos / evolution-fine-tuning
Official code, models, and dataset for "Evolution Fine-Tuning (EFT): Learning to Discover Across 371 Optimization Tasks"
☆25 · Jun 30, 2026 · Updated 3 weeks ago
fangjh21 / PALM
PALM: A Efficient Performance Simulator for Tiled Accelerators with Large-scale Model Training
☆21 · Jun 12, 2024 · Updated 2 years ago
song2yu / world-model-vla
World Model & VLA Survey - Interactive Research Page
☆17 · May 26, 2026 · Updated last month
elchun / lndf_robot
☆22 · Feb 8, 2023 · Updated 3 years ago
yanghr / BSQ
BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization (ICLR 2021)
☆41 · Jan 12, 2021 · Updated 5 years ago
litian96 / AdaDPS
Private Adaptive Optimization with Side Information (ICML '22)
☆16 · Jun 23, 2022 · Updated 4 years ago
GATECH-EIC / HALO
The official code for [ECCV2020] "HALO: Hardware-aware Learning to Optimize"
☆10 · Mar 22, 2023 · Updated 3 years ago
JerryYin777 / Cross-Layer-Attention
Self-reproduction code for the paper "Reducing Transformer Key-Value Cache Size with Cross-Layer Attention" (MIT CSAIL)
☆17 · May 24, 2024 · Updated 2 years ago
Shalev-Lifshitz / MultiAgentVerification
Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers
☆33 · Mar 1, 2025 · Updated last year
NiuChaoyue / Secure-Federated-Submodel-Learning
☆15 · Jul 13, 2021 · Updated 5 years ago
GATECH-EIC / ShiftAddViT
[NeurIPS 2023] ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer
☆30 · Dec 6, 2023 · Updated 2 years ago
Terra-Flux / PolyRL
[NSDI'26] PolyRL is a reinforcement learning framework for LLMs that harvests spot instances in the cloud to reduce cost.
☆19 · Mar 30, 2026 · Updated 3 months ago
jallen89 / theia-cdm-samples
☆11 · May 3, 2019 · Updated 7 years ago
LLMServe / FastServe
☆29 · Sep 26, 2025 · Updated 10 months ago
BatsResearch / safranchik-aaai20-code
☆13 · Jan 21, 2022 · Updated 4 years ago
hychen-naza / LEAP
☆17 · Sep 28, 2023 · Updated 2 years ago
nyu-systems / nyu-systems-seminar
The NYU Systems Seminar
☆24 · Feb 26, 2024 · Updated 2 years ago
PRIME-RL / RL-Compositionality
FROM $f(x)$ AND $g(x)$ TO $f(g(x))$: LLMs Learn New Skills in RL by Composing Old Ones
☆68 · Jan 26, 2026 · Updated 6 months ago
kyegomez / MGQA
The open source implementation of the multi grouped query attention by the paper "GQA: Training Generalized Multi-Query Transformer Model…
☆17 · Dec 11, 2023 · Updated 2 years ago
NVIDIA / nvidia-dlfw-inspect
The tool facilitates debugging convergence issues and testing new algorithms and recipes for training LLMs using Nvidia libraries such as…
☆21 · Sep 17, 2025 · Updated 10 months ago
sky-lzy / Structured-4D-Model
☆23 · Jul 2, 2026 · Updated 3 weeks ago
yxli2123 / LoftQ
☆234 · Jun 11, 2024 · Updated 2 years ago
IST-DASLab / HALO
HALO: Hadamard-Assisted Low-Precision Optimization and Training method for finetuning LLMs. 🚀 The official implementation of https://arx…
☆31 · Feb 17, 2025 · Updated last year
GATECH-EIC / S3-Router
[NeurIPS 2022] "Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Spee…
☆17 · Sep 19, 2023 · Updated 2 years ago
JmlrOrg / dmlr-style-file
☆12 · Nov 21, 2023 · Updated 2 years ago
merrymercy / Awesome-Efficient-LLM
A curated list for Efficient Large Language Models
☆11 · Mar 25, 2024 · Updated 2 years ago
VideoVLA-Project / VideoVLA
[NeurIPS 2025] VideoVLA: Video Generators Can Be Generalizable Robot Manipulators
☆32 · Jun 26, 2026 · Updated last month
aravinds92 / Systolic-Array
Systolic array based hardware for Image processing on the SPARTAN-6 FPGA
☆13 · May 26, 2016 · Updated 10 years ago
Egg-Hu / SMI
[ICML 2024] Sparse Model Inversion: Efficient Inversion of Vision Transformers with Less Hallucination
☆14 · Apr 29, 2025 · Updated last year
rainbow979 / replandiffuser
☆25 · Oct 22, 2023 · Updated 2 years ago
Snowflake-AI-Research / Arctic-Platform
Arctic Training and Inference Platform
☆57 · Updated this week
ljiahao / TeLL
TeLL: Log Level Suggestions via Modeling Multi-Level Code Block Information, ISSTA'22
☆14 · Jul 14, 2022 · Updated 4 years ago
davidbrandfonbrener / color-filter-olmo
☆13 · Dec 12, 2025 · Updated 7 months ago
huweim / dataflow_architecture
Research about dataflow architecture
☆15 · Nov 30, 2023 · Updated 2 years ago
mit-han-lab / SMEPO
☆16 · May 27, 2026 · Updated last month
ypwang61 / negCLIPLoss_NormSim
[NeurIPS 2024 Spotlight] CLIPLoss and Norm-Based Data Selection Methods for Multimodal Contrastive Learning.
☆14 · Dec 12, 2024 · Updated last year