bytedance / tarsier

Tarsier is a family of large-scale video-language models designed to generate high-quality video descriptions, with strong general video understanding capability.
379 · Updated last month

Alternatives and similar repositories for tarsier

Users who are interested in tarsier are comparing it to the libraries listed below.
