Spawning-Inc / datadiligence
Respect generative AI opt-outs in your ML training pipeline.
☆39 · Updated last year
Alternatives and similar repositories for datadiligence
Users interested in datadiligence are comparing it to the libraries listed below.
- A client library for LAION's effort to filter CommonCrawl with CLIP, building a large-scale image-text dataset. ☆32 · Updated 2 years ago
- A library for detecting problematic data segments in structured and unstructured data with few lines of code. ☆64 · Updated 2 years ago
- 🤗 Optimum ONNX: Export your model to ONNX and run inference with ONNX Runtime ☆114 · Updated last week
- Blueprint to Build Your Own Timeline Algorithm ☆72 · Updated 4 months ago
- Lightweight tools for quick and easy LLM demos ☆28 · Updated last year
- ☆42 · Updated last year
- ☆50 · Updated 3 months ago
- Benchmark scripts for comparing different tokenizers and sentence segmenters of German ☆12 · Updated 2 years ago
- ☆52 · Updated 2 years ago
- Converts JSON-Schema to GBNF grammar to use with llama.cpp ☆55 · Updated 2 years ago
- BlinkDL's RWKV-v4 running in the browser ☆48 · Updated 2 years ago
- Tracking instruction-tuned LLM openness. Paper: Liesenfeld, Andreas, Alianda Lopez, and Mark Dingemanse. 2023. “Opening up ChatGPT: Track… ☆119 · Updated 11 months ago
- 🦄 An NLP application just for the lols: built with Haystack to get an overview of what a user is posting about on Twitter ☆46 · Updated 2 years ago
- webgpu autograd library ☆33 · Updated 8 months ago
- Completion After Prompt Probability. Make your LLM make a choice ☆82 · Updated last year
- Pretraining data reconstruction scripts for Apertus ☆113 · Updated 3 months ago
- ☆64 · Updated 2 years ago
- GPU accelerated client-side embeddings for vector search, RAG etc. ☆65 · Updated 2 years ago
- Drop-in replacement for OpenAI, but with open models. ☆156 · Updated 2 years ago
- Gradio Client in Rust. ☆28 · Updated 2 months ago
- ☆13 · Updated 2 years ago
- Unofficial python bindings for the rust llm library. 🐍❤️🦀 ☆76 · Updated 2 years ago
- ☆23 · Updated last year
- Run Vision LLMs, TTS and STT APIs. Website and API for https://text-generator.io ☆39 · Updated this week
- assign color hues to a collection of text fragments based on embeddings ☆20 · Updated last year
- [WIP] A 🔥 interface for running code in the cloud ☆86 · Updated 2 years ago
- GitHub action that'll sync files from a GitHub Repo with the Hugging Face Hub 🤗 ☆79 · Updated last year
- Code repo for "Most Language Models can be Poets too: An AI Writing Assistant and Constrained Text Generation Studio" at the (CAI2) wor… ☆214 · Updated 2 years ago
- Google Colab Notebooks for Transcription with Whisper ☆25 · Updated 9 months ago
- Production-ready data processing made easy and shareable ☆358 · Updated last year