ml6team / fondant
Production-ready data processing made easy and shareable
☆351, updated last year
Alternatives and similar repositories for fondant
Users who are interested in fondant are comparing it to the libraries listed below.
- ☆198, updated last year
- [WIP] A 🔥 interface for running code in the cloud (☆85, updated 2 years ago)
- Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines (☆197, updated last year)
- Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets. (☆157, updated last year)
- Domain Adapted Language Modeling Toolkit - E2E RAG (☆322, updated 6 months ago)
- Easily convert common crawl to a dataset of caption and document. Image/text Audio/text Video/text, ... (☆316, updated last year)
- Diffusers-Interpret 🤗🧨🕵️‍♀️: Model explainability for 🤗 Diffusers. Get explanations for your generated images. (☆276, updated 2 years ago)
- The GeoV model is a large language model designed by Georges Harik and uses Rotary Positional Embeddings with Relative distances (RoPER).… (☆121, updated 2 years ago)
- Reimplementation of the task generation part from the Alpaca paper (☆119, updated 2 years ago)
- A PyTorch library of curated Transformer models and their composable components (☆890, updated last year)
- ☆303, updated 11 months ago
- Let's build better datasets, together! (☆259, updated 5 months ago)
- ☆170, updated last year
- A Simple Bulk Labelling Tool (☆585, updated 5 months ago)
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free (☆231, updated 7 months ago)
- Understanding large language models (☆116, updated 2 years ago)
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o… (☆136, updated 2 weeks ago)
- ☆123, updated 7 months ago
- Full finetuning of large language models without large memory requirements (☆93, updated last year)
- Train vision models using JAX and 🤗 transformers (☆97, updated last month)
- Exploring finetuning public checkpoints on filter 8K sequences on Pile (☆114, updated 2 years ago)
- Directly Connecting Python to LLMs via Strongly-Typed Functions, Dataclasses, Interfaces & Generic Types (☆399, updated 3 months ago)
- Maybe the new state of the art vision model? we'll see 🤷‍♂️ (☆163, updated last year)
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI (☆222, updated last year)
- Toolkit for attaching, training, saving and loading of new heads for transformer models (☆279, updated 3 months ago)
- 🤗 Disaggregators: Curated data labelers for in-depth analysis. (☆66, updated 2 years ago)
- This repository implements the idea of "caption upsampling" from DALL-E 3 with Zephyr-7B and gathers results with SDXL. (☆152, updated last year)
- TitanML Takeoff Server is an optimization, compression and deployment platform that makes state of the art machine learning models access… (☆114, updated last year)
- Datasets and models for instruction-tuning (☆238, updated last year)
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes (☆206, updated 3 weeks ago)