zhengzangw / Sequence-Scheduling

PyTorch implementation of the paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline".
74 · Updated last year

Related projects

Alternatives and complementary repositories for Sequence-Scheduling