unum-cloud / uform
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and video, up to 5x faster than OpenAI CLIP and LLaVA
☆1,098 Updated 2 months ago
Alternatives and similar repositories for uform:
Users who are interested in uform are comparing it to the libraries listed below:
- Fast Open-Source Search & Clustering engine × for Vectors & Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C… ☆2,581 Updated last month
- ☆707 Updated last year
- CLIP inference in plain C/C++ with no extra dependencies ☆485 Updated 6 months ago
- Automatically create Faiss knn indices with the most optimal similarity search parameters. ☆841 Updated 9 months ago
- Accelerate your Hugging Face Transformers 7.6-9x. Native to Hugging Face and PyTorch. ☆686 Updated 6 months ago
- Fine-tune mistral-7B on 3090s, a100s, h100s ☆706 Updated last year
- ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Expert… ☆1,361 Updated 3 months ago
- C++ implementation for BLOOM ☆810 Updated last year
- [ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings ☆1,920 Updated last month
- WhisperFusion builds upon the capabilities of WhisperLive and WhisperSpeech to provide seamless conversations with an AI. ☆1,584 Updated 7 months ago
- This repository contains the official implementation of the research paper, "MobileCLIP: Fast Image-Text Models through Multi-Modal Reinf… ☆844 Updated 3 months ago
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI ☆1,367 Updated 11 months ago
- Neural Search ☆351 Updated this week
- ☆1,273 Updated last year
- MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX. ☆935 Updated this week
- Accelerate inference and training of Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization… ☆2,805 Updated this week
- LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions ☆820 Updated last year
- llama.cpp with the BakLLaVA model describes what it sees ☆384 Updated last year
- MINT-1T: A one trillion token multimodal interleaved dataset. ☆801 Updated 7 months ago
- Pybind11 bindings for Whisper.cpp ☆328 Updated 3 months ago
- ☆1,025 Updated last year
- Train Models Contrastively in Pytorch ☆658 Updated 3 weeks ago
- MultimodalC4 is a multimodal extension of c4 that interleaves millions of images with text. ☆918 Updated 9 months ago
- A minimal Python package for storing and retrieving text using chunking, embeddings, and vector search. ☆695 Updated 5 months ago
- Fast, Accurate, Lightweight Python library to make State of the Art Embedding ☆1,844 Updated this week
- Whisper with Medusa heads ☆823 Updated 2 weeks ago
- 4M: Massively Multimodal Masked Modeling ☆1,691 Updated this week
- Training LLMs with QLoRA + FSDP ☆1,458 Updated 4 months ago
- Blazing fast framework for fine-tuning similarity learning models ☆656 Updated 2 months ago
- Run inference on MPT-30B using CPU ☆575 Updated last year