Document Filters is an SDK that extracts data from unstructured sources for applications such as content indexing, e-discovery, data migration, and feeding data into AI/ML models. It supports deep inspection, data extraction, output manipulation, and conversion for virtually any type of document, in any programming language.
☆26 · May 13, 2026 · Updated last month
Alternatives and similar repositories for DocumentFilters
Users that are interested in DocumentFilters are comparing it to the libraries listed below.
- ZSH plugin for MySQL. ☆17 · Oct 1, 2022 · Updated 3 years ago
- Interactive git commands using fzf, available as a zsh plugin ☆18 · May 5, 2026 · Updated last month
- ZSH plugin that creates a file from a template ☆15 · Apr 24, 2020 · Updated 6 years ago
- Fork from https://github.com/robbyrussell/oh-my-zsh/blob/master/plugins/colored-man-pages/colored-man-pages.plugin.zsh ☆17 · Dec 22, 2016 · Updated 9 years ago
- Custom zsh plugin to create custom plugins ☆13 · May 27, 2021 · Updated 5 years ago
- Imports highlights from Shortform.com to Readwise.io. ☆17 · Oct 10, 2021 · Updated 4 years ago
- ☆18 · Jan 21, 2016 · Updated 10 years ago
- Zsh plugin to add safe-rm functionality so that `rm` will put files in the trash. ☆28 · Nov 25, 2024 · Updated last year
- Sync Hypothesis <-> Zotero ☆22 · Apr 6, 2020 · Updated 6 years ago
- A monolithic index that supports worst-case optimal joins (WCOJ) by providing all collation orders in a single redundancy eliminating dat… ☆18 · Sep 18, 2025 · Updated 8 months ago
- Custom shell (sh, bash, zsh) plugins ☆31 · Apr 7, 2026 · Updated 2 months ago
- mReasoner is a unified computational implementation of the model theory of thinking and reasoning ☆15 · Aug 17, 2023 · Updated 2 years ago
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification ☆11 · Aug 12, 2023 · Updated 2 years ago
- Obsidian plugin that adds the Markdown title within your notes to the file explorer ☆30 · Feb 28, 2023 · Updated 3 years ago
- Wikimedia Enterprise - client SDK in Python ☆22 · May 4, 2026 · Updated last month
- Blazing fast signature detection ☆11 · Sep 5, 2022 · Updated 3 years ago
- Export/access your Hypothes.is data: annotations and profile info ☆47 · Jul 15, 2025 · Updated 11 months ago
- Formula to detect the ease of reading a text according to the Coleman-Liau index (1975) ☆14 · Nov 1, 2022 · Updated 3 years ago
- Run greatexpectations.io on ANY SQL Engine using REST API. Supported by FastAPI, Pydantic and SQLAlchemy as best data quality tool ☆14 · Dec 12, 2025 · Updated 6 months ago
- Via Text Density Simple Web Crawler With Go ☆13 · Mar 19, 2023 · Updated 3 years ago
- C4RepSet: Representative Subset from C4 data for Training Pre-trained LMs ☆11 · Jan 13, 2023 · Updated 3 years ago
- A Julia library for working with Data Package. ☆11 · Aug 10, 2021 · Updated 4 years ago
- Dataset from Tip of the Tongue Known-Item Retrieval (2021) paper. ☆12 · Nov 4, 2021 · Updated 4 years ago
- Prevent XSS attacks by sanitizing HTML (this is different from escaping!) ☆22 · Oct 14, 2023 · Updated 2 years ago
- Code that drives the public web-based tools for the Media Cloud Online News Archive and Directory. ☆12 · Updated this week
- A UI designer for constructing AI applications with OpenSearch ☆16 · Updated this week
- Scala implementations of standard algorithms for the Multi-Armed Bandits Problem. ☆11 · May 7, 2016 · Updated 10 years ago
- Encryption which converts English characters to Unicode characters that mimic their appearance ☆12 · Sep 17, 2017 · Updated 8 years ago
- How to backdoor Diffie-Hellman, lessons learned from the Socat non-prime prime ☆11 · Jun 29, 2021 · Updated 4 years ago
- Temporal and Causal Reasoning (dataset) ☆10 · Apr 19, 2022 · Updated 4 years ago
- Xayn AI ☆18 · May 9, 2022 · Updated 4 years ago
- The inverted index exchange format as defined as part of the Open-Source IR Replicability Challenge (OSIRRC) initiative ☆11 · Aug 6, 2025 · Updated 10 months ago
- LLM Oracle is a GPT-4 powered tool for predicting future events. It's like a Magic 8 Ball that is able to perform basic research, calcula… ☆17 · May 27, 2023 · Updated 3 years ago
- Security research organization dedicated to finding low-hanging, critical vulnerabilities. ☆15 · May 12, 2022 · Updated 4 years ago
- A service to auto-hide Hacker News articles by keyword, site, and more ☆12 · Oct 12, 2025 · Updated 8 months ago
- R library for common information retrieval metrics ☆14 · Jun 5, 2023 · Updated 3 years ago
- ☆11 · Updated this week
- High-performance MCP server, code graph engine & evolutionary algorithm platform in Zig. 33 tools: GitHub project management, agent swarm… ☆59 · Apr 19, 2026 · Updated last month
- I moved this folder. Keeping this repo up for archival purposes only. ☆17 · Jun 5, 2024 · Updated 2 years ago