CatchTheTornado / text-extract-api
Document (PDF, Word, PPTX ...) extraction and parsing API using state-of-the-art OCRs and Ollama-supported models. Anonymizes documents, removes PII, and converts any document or picture to structured JSON or Markdown.
2,983 · Dec 8, 2025 · Updated 2 months ago

Alternatives and similar repositories for text-extract-api

Users who are interested in text-extract-api compare it to the libraries listed below.
