kyegomez / Blockwise-Parallel-Transformer

32 times longer context window than vanilla Transformers and up to 4 times longer than memory-efficient Transformers.
43 · Updated last year

Related projects

Alternatives and complementary repositories for Blockwise-Parallel-Transformer