soham-1 / fastapi_pdfextractor
An API built with FastAPI that extracts the text content of PDFs using pdfminer. It also supports scanned images in PDFs by using Tesseract and OCRmyPDF.
☆15 Updated 3 years ago
Alternatives and similar repositories for fastapi_pdfextractor:
Users who are interested in fastapi_pdfextractor are comparing it to the libraries listed below.
- Code examples on how to integrate various types of scrapers with Scraper API. ☆29 Updated 3 years ago
- Pipeline for converting PDFs to raw text with PaddleOCR ☆23 Updated last year
- Redis Queue Dashboard based on FastAPI ☆100 Updated 3 months ago
- FastAPI with Docker and Traefik ☆110 Updated 2 years ago
- FastAPI Async MongoDB boilerplate REST API ☆39 Updated last year
- The faststream-gen library uses advanced AI to generate FastStream code from user descriptions, speeding up FastStream app development. ☆47 Updated last year
- ☆23 Updated 3 weeks ago
- A simple REST API project using a modern stack with FastAPI (Celery, Redis, Postgres, SQLAlchemy, Docker, Docker Compose). ☆40 Updated 2 years ago
- Async bulk data ingestion and querying in various document, graph and vector databases via their Python clients ☆36 Updated last year
- Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pip… ☆67 Updated 3 weeks ago
- 🚀 Kew - A Fast, Redis-backed Task Queue Manager for Python ☆34 Updated 3 months ago
- A demo project comparing two web scraping frameworks, Playwright and Selenium, using the new pipelining tool Dagster ☆13 Updated 3 years ago
- Extensions for Python Markdown ☆10 Updated 6 months ago
- Benchmark study on LanceDB, an embedded vector DB, for full-text search and vector search ☆24 Updated last year
- 🛤️ Pathik - High-Performance Web Crawler ⚡ ☆26 Updated 3 weeks ago
- An autonomous agent to automate your code review workflow, made using crewAI ☆13 Updated last year
- Functional composable pipelines allowing clean separation of the business logic and its implementation ☆11 Updated 10 months ago
- FastAPI-Scheduler is a simple scheduled task management FastAPI extension based on APScheduler. ☆99 Updated last year
- Add dependencies specified in requirements.txt file(s) to your Poetry or UV project ☆34 Updated last month
- A simple docker-compose app for orchestrating a FastAPI application, a Celery queue with RabbitMQ (broker) and Redis (backend) ☆133 Updated 2 years ago
- Pluggable DSL that uses pipes to perform a series of linear transformations to extract data ☆16 Updated 9 months ago
- Experiment on QnA tabular data using LLMs and SQL ☆28 Updated 6 months ago
- Async MongoDB with vanilla Pydantic v2+ - made easy. ☆19 Updated 3 months ago
- ☆55 Updated last year
- Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provi… ☆35 Updated last month
- OpenAI compatible API for open source LLMs ☆15 Updated last year
- FastAPI + Celery = ♥! Learn about those technologies by consulting https://derlin.github.io/introduction-to-fastapi-and-celery/ ☆94 Updated last year
- Demo example of consumer goods categorization ☆27 Updated last year
- FastAPI + ODMantic example ☆62 Updated last year
- Fine-tune an LLM to convert an invoice or receipt image to a receipt XML or JSON object. ☆44 Updated 8 months ago