unum-cloud / uform
Pocket-Sized Multimodal AI for content understanding and generation across multilingual texts, images, and video, up to 5x faster than OpenAI CLIP and LLaVA
☆1,019 · Updated 3 months ago
Related projects:
- ☆694 · Updated 6 months ago
- Fast Open-Source Search & Clustering engine × for Vectors & Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C… ☆2,121 · Updated this week
- Fine-tune mistral-7B on 3090s, a100s, h100s ☆701 · Updated 11 months ago
- ICLR2024 Spotlight: curation/training code, metadata, distribution and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Expert… ☆1,169 · Updated 2 weeks ago
- YaRN: Efficient Context Window Extension of Large Language Models ☆1,308 · Updated 5 months ago
- 4M: Massively Multimodal Masked Modeling ☆1,543 · Updated 2 months ago
- DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. ☆797 · Updated last month
- Implementation of the training framework proposed in Self-Rewarding Language Model, from MetaAI ☆1,309 · Updated 5 months ago
- CLIP inference in plain C/C++ with no extra dependencies ☆433 · Updated last month
- [ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings ☆1,838 · Updated 3 weeks ago
- Fast, Accurate, Lightweight Python library to make State of the Art Embedding ☆1,371 · Updated this week
- streamline the fine-tuning process for multimodal models: PaliGemma, Florence-2, Phi-3.5 Vision ☆1,285 · Updated this week
- Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input" ☆1,049 · Updated 6 months ago
- Curate better data for LLMs ☆934 · Updated 6 months ago
- A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models. ☆845 · Updated last week
- The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts". ☆1,294 · Updated 8 months ago
- Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs ☆2,087 · Updated this week
- LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions ☆810 · Updated last year
- ☆544 · Updated this week
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… ☆1,408 · Updated this week
- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones ☆1,235 · Updated 5 months ago
- An Open-source Toolkit for LLM Development ☆2,687 · Updated 3 months ago
- A FastAPI service for semantic text search using precomputed embeddings and advanced similarity measures, with built-in support for vario… ☆921 · Updated last week
- An open-source framework for training large multimodal models. ☆3,659 · Updated 2 weeks ago
- A PyTorch library of curated Transformer models and their composable components ☆861 · Updated 5 months ago
- Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders' ☆1,137 · Updated this week
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ☆1,935 · Updated last week
- Code for fine-tuning Platypus fam LLMs using LoRA ☆625 · Updated 7 months ago
- Automatically create Faiss knn indices with the most optimal similarity search parameters. ☆802 · Updated 3 months ago
- S-LoRA: Serving Thousands of Concurrent LoRA Adapters ☆1,702 · Updated 7 months ago