ahmedkhemiri95 / PDFs-TextExtract
Multiple and Large PDF Documents Text Extraction.
☆129 Updated 9 months ago
Related projects
Alternatives and complementary repositories for PDFs-TextExtract
- Python library to extract tabular data from images and scanned PDFs ☆264 Updated 3 months ago
- A Python library for extracting text from PDFs without losing the formatting of the PDF content. ☆73 Updated 2 years ago
- Python scripts to extract text from PDFs, save it as a text file, export a list of words and their frequencies to a CSV file for further … ☆37 Updated 7 years ago
- Run OCR, extract information from documents and classify them. In addition, annotate documents and build custom NLP and computer vision m… ☆62 Updated this week
- Convert a PDF via OCR to a TXT file in UTF-8 encoding ☆141 Updated last year
- Demos, examples and utilities using PyMuPDF ☆578 Updated 4 months ago
- Document Search Engine Tool ☆71 Updated last year
- Text summarization using NLP: fetches BBC News articles and summarizes their text, and also supports custom article summarization ☆41 Updated last year
- NLP Cloud serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, paraphrasing, … ☆76 Updated 8 months ago
- Using the Gmail API to topic model my recommended Medium reads ☆24 Updated 3 years ago
- Extracting Semi-Structured Data from PDFs on a large scale ☆51 Updated 2 years ago
- Custom recipe and utilities for document processing ☆198 Updated 2 years ago
- This is an application that automates the process of text analysis with a user-friendly GUI. 📱 It has been implemented using Python and … ☆34 Updated 2 years ago
- This is a GUI for scraping PDFs with the help of optical character recognition, making it easier than ever to scrape PDFs. ☆86 Updated 3 years ago
- Search for and retrieve US Patent and Trademark Office Patent Data ☆76 Updated 4 years ago
- A comprehensive tutorial for OCR in Python using Tesseract-OCR and OpenCV ☆118 Updated 2 years ago
- Using Natural Language Processing to standardize Company Names ☆12 Updated 3 years ago
- Open Access PDF harvester, metadata aggregator and full-text ingester ☆55 Updated 6 months ago
- Quote extraction for modular journalism (JournalismAI collab 2021) ☆226 Updated 2 years ago
- Python API for RapidMiner Studio and Server. ☆48 Updated 2 weeks ago
- ☆167 Updated 2 years ago
- The WIPO Manual on Open Source Patent Analytics ☆49 Updated 2 years ago
- OpenNyAI is a mission aimed at developing open source software and datasets to catalyze the creation of AI-powered solutions to improve a… ☆70 Updated 6 months ago
- A client library for accessing the USPTO Open Data APIs, written in Python. ☆91 Updated 2 years ago
- A curated list of resources around PDF files ☆108 Updated 3 months ago
- Web scraping the popular job listing site "Glassdoor" with Python and BeautifulSoup. Implemented from scratch. ☆72 Updated 4 months ago
- NLP tool for scraping text from a corpus of PDF files, embedding the sentences in the text and finding semantically similar sentences to … ☆36 Updated 2 years ago
- A series of notebooks demonstrating how to build simple NLP web apps with Gradio and Hugging Face transformers ☆45 Updated 3 years ago
- Case Studies on Forensic Accounting using Data Analysis ☆43 Updated 5 years ago
- Mastering spaCy, published by Packt ☆126 Updated last year