alexpashevich / E.T.

Episodic Transformer (E.T.) is a novel attention-based architecture for vision-and-language navigation. E.T. is based on a multimodal transformer that encodes language inputs and the full episode history of visual observations and actions.
87 · Updated last year

Related projects

Alternatives and complementary repositories for E.T.