Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
☆29 Apr 7, 2023 Updated 3 years ago
Alternatives and similar repositories for community
Users that are interested in community are comparing it to the libraries listed below.
- ☆19 May 23, 2023 Updated 2 years ago
- ☆207 Apr 4, 2026 Updated last week
- 💙 Unstructured Data Connectors for Haystack 2.0 ☆17 Sep 21, 2023 Updated 2 years ago
- a tool to snapshot sqlite databases you don't own ☆24 Jan 23, 2026 Updated 2 months ago
- Code created for blog series on unsupervised feature/topic extraction from corporate email content. An implementation for cleaning raw e… ☆10 Oct 21, 2021 Updated 4 years ago
- Using LLMs to manage files and generating metadata such as tags and summaries. ☆17 Apr 11, 2025 Updated last year
- ☆903 Apr 3, 2026 Updated last week
- A Firefox and Google Chrome extension to clip websites and download them into a readable markdown file. ☆44 Jan 13, 2019 Updated 7 years ago
- ☆22 Mar 18, 2024 Updated 2 years ago
- A prompt-engineering technique for creating personalized custom instructions on ChatGPT ☆17 Oct 26, 2023 Updated 2 years ago
- Chrome and Firefox extensions for Slurp ☆28 Apr 9, 2024 Updated 2 years ago
- Bindings for H3 to SQLite3 ☆19 Feb 12, 2026 Updated last month
- This guide is made to help you deploy your own document RAG pipeline with Open-WebUI and a local LLM. ☆38 Mar 20, 2025 Updated last year
- Prompt templating tools designed for interacting with language interfaces like OpenAI's ChatGPT in Obsidian. ☆25 Apr 3, 2024 Updated 2 years ago
- Demonstrate using MCP with Pydantic AI framework ☆14 Mar 14, 2025 Updated last year
- Integrate Microsoft's Markitdown tool to convert various file formats to Markdown for your vault. ☆34 Mar 26, 2026 Updated 2 weeks ago
- A Model Context Protocol (MCP) server that integrates with X using the @elizaOS `agent-twitter-client` package, allowing AI models to int… ☆28 Mar 30, 2026 Updated last week
- Open Source AI Database for Voice Agent Transcripts | Call Analysis & Insights | Extraction | Labelling & Classification ☆23 Nov 3, 2025 Updated 5 months ago
- Viewer for text datasets in formats like HuggingFace, JSONL, etc. ☆15 Feb 25, 2025 Updated last year
- Add website scraping abilities to Datasette ☆66 Mar 4, 2023 Updated 3 years ago
- A Python tool that uses AI to generate well-structured technical and educational articles from any topic. Features transparent reasoning,… ☆18 Apr 19, 2025 Updated 11 months ago
- This repository hosts code for converting the original MLP Mixer models (JAX) to TensorFlow. ☆15 Sep 29, 2021 Updated 4 years ago
- Tool4AI: A model agnostic, LLM friendly router for tool/function call ☆20 Aug 19, 2024 Updated last year
- The official Python library for Formulaic ☆18 Apr 25, 2024 Updated last year
- Unsupervised spoken sentence embeddings ☆14 Dec 14, 2022 Updated 3 years ago
- This repository hosts an advanced ROS2 package designed to seamlessly integrate WebRTC into robotic applications. Its primary purpose is … ☆10 Nov 9, 2023 Updated 2 years ago
- Materials for the Introduction to Jurimetric Research course ☆12 Oct 25, 2023 Updated 2 years ago
- Copy the web as markdown ☆41 Aug 17, 2025 Updated 7 months ago
- This is a template retrieval repo to create a Flask api server using LangChain with Cohere embeddings and Qdrant Vector Database ☆78 Apr 30, 2023 Updated 2 years ago
- Build Contact Form 7 forms from PDF forms. Get PDFs auto-filled and attached to email messages and/or website responses on form submissio… ☆12 Apr 2, 2026 Updated last week
- ☆35 Jun 22, 2024 Updated last year
- Injection of MSIL using Cecil ☆12 Jul 28, 2015 Updated 10 years ago
- SLUB Document Classification and Similarity Analysis ☆10 Aug 31, 2023 Updated 2 years ago
- Strapi Email service provider for Postmark ☆13 Oct 20, 2025 Updated 5 months ago
- Code for different kinds of open-source data extraction tools, written in Python ☆10 Nov 15, 2024 Updated last year
- ☆15 Jun 9, 2023 Updated 2 years ago
- Trained BERT and Word2Vec legal clause classifiers for SPACY using the Atticus Project's Open Source Contract Label Corpus ☆13 Jan 2, 2021 Updated 5 years ago
- A lightweight React hook that automatically manages fade overlays for scrollable containers. Provides smooth gradient transitions at the … ☆12 Aug 11, 2025 Updated 8 months ago
- ☆14 Jul 25, 2024 Updated last year