alexpashevich / E.T.

Episodic Transformer (E.T.) is a novel attention-based architecture for vision-and-language navigation. E.T. is based on a multimodal transformer that encodes language inputs and the full episode history of visual observations and actions.
90 · Updated last year

Alternatives and similar repositories for E.T.:

Users who are interested in E.T. are comparing it to the libraries listed below.