ml6team / fondant
Production-ready data processing made easy and shareable
☆353 Updated last year
Alternatives and similar repositories for fondant
Users who are interested in fondant are comparing it to the libraries listed below.
- ☆197 Updated last year
- [WIP] A 🔥 interface for running code in the cloud ☆85 Updated 2 years ago
- ☆50 Updated last year
- Drop in replacement for OpenAI, but with Open models. ☆153 Updated 2 years ago
- Domain Adapted Language Modeling Toolkit - E2E RAG ☆328 Updated 11 months ago
- Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines ☆195 Updated last year
- The GeoV model is a large language model designed by Georges Harik and uses Rotary Positional Embeddings with Relative distances (RoPER).… ☆121 Updated 2 years ago
- Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first app… ☆169 Updated last year
- Easily convert common crawl to a dataset of caption and document. Image/text Audio/text Video/text, ... ☆318 Updated last year
- Let's build better datasets, together! ☆262 Updated 9 months ago
- ☆124 Updated 11 months ago
- Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets. ☆159 Updated last year
- Small finetuned LLMs for a diverse set of useful tasks ☆127 Updated 2 years ago
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free ☆231 Updated 11 months ago
- Reimplementation of the task generation part from the Alpaca paper ☆118 Updated 2 years ago
- Smol but mighty language model ☆61 Updated 2 years ago
- Understanding large language models ☆119 Updated 2 years ago
- data cleaning and curation for unstructured text ☆328 Updated last year
- Command-line script for inferencing from models such as MPT-7B-Chat ☆99 Updated 2 years ago
- This repository implements the idea of "caption upsampling" from DALL-E 3 with Zephyr-7B and gathers results with SDXL. ☆157 Updated last year
- AI Data Management & Evaluation Platform ☆216 Updated 2 years ago
- git extension for {collaborative, communal, continual} model development ☆215 Updated 10 months ago
- ☆463 Updated last year
- Landmark Attention: Random-Access Infinite Context Length for Transformers QLoRA ☆123 Updated 2 years ago
- batched loras ☆346 Updated 2 years ago
- 📚 Datasets and models for instruction-tuning ☆239 Updated 2 years ago
- ☆95 Updated 2 years ago
- Full finetuning of large language models without large memory requirements ☆93 Updated 2 weeks ago
- Used for adaptive human in the loop evaluation of language and embedding models. ☆308 Updated 2 years ago
- 🤖 A PyTorch library of curated Transformer models and their composable components ☆893 Updated last year