FasterDecoding / Medusa

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
2,312 · Updated 4 months ago

Related projects

Alternatives and complementary repositories for Medusa