OpenMOSE / RWKV-Infer

Large-scale RWKV v6 inference with FLA. Can run inference that combines multiple states (pseudo-MoE). Easy to deploy with Docker. Supports true multi-batch generation and dynamic state switching. CUDA and ROCm are supported :)
15 · Updated 2 weeks ago

Related projects: