StatCan / SLICEmyPDF
This project uses the SLICE algorithm to extract information from a text-based PDF page containing financial statements (tabular data). It can also be used to extract regular tables, but the output will contain all text on the page.
☆64 Updated 3 years ago
Alternatives and similar repositories for SLICEmyPDF
Users who are interested in SLICEmyPDF are comparing it to the libraries listed below.
- Python-based parser for parsing XBRL and iXBRL files ☆139 Updated 4 months ago
- Securities and Exchange Commission utility package for dealing with Edgar database. Includes methods to download index files and SEC file… ☆36 Updated 4 years ago
- demo using FuzzyWuzzy matching company names ☆75 Updated 3 years ago
- Google Colab Demo of CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents ☆47 Updated 3 years ago
- OpenEDGAR (openedgar.io) ☆303 Updated 2 years ago
- code for http://www.python4cpas.com/ ☆36 Updated 5 years ago
- Python APIs for Open PermID ☆15 Updated last year
- ☆53 Updated 3 years ago
- Python library to extract tabular data from images and scanned PDFs ☆278 Updated 10 months ago
- Adobe PDFServices python SDK Samples ☆152 Updated last month
- PDF Table Extractor - repository to hold revisable version of code from https://www.cvast.tuwien.ac.at/projects/pdf2table by Burcu Yildiz ☆38 Updated last year
- Helper tools to analyze the "Financial Statement Data Sets" from the U.S. securities and exchange commission (sec.gov) ☆65 Updated this week
- Feature engineering tool to efficiently create effective, arbitrarily complex arithmetic combinations of numeric features ☆10 Updated 8 months ago
- A Python tool to help extracting information from structured PDFs. ☆404 Updated this week
- semantically distinct key phrase extraction using hilbert hashes. ☆50 Updated 3 years ago
- Access FED API ☆35 Updated 5 years ago
- Application and python script to identify, remove, and/or recode personally identifiable information (PII) from field experiment datasets… ☆46 Updated 3 years ago
- Fuzzy joins for python pandas - easily join different datasets ☆59 Updated 4 years ago
- ☆40 Updated last year
- openseries is a project with tools to analyze financial timeseries of a single asset or a group of assets. It is solely made for daily or… ☆28 Updated this week
- A simple python library that allows for easy access of the SEC website so that someone can parse filings, collect data, and query documen… ☆126 Updated 5 months ago
- Download and extract MDA section from edgar 10k forms ☆81 Updated 9 months ago
- Simplifies use of the Dedupe library via Pandas ☆136 Updated 2 years ago
- Python wrapper for xpdf ☆19 Updated 5 years ago
- Using Natural Language Processing to standardize Company Names ☆12 Updated 3 years ago
- pandas_ui helps you wrangle & explore your data and create custom visualizations without digging through StackOverflow. All inside your J… ☆154 Updated 3 years ago
- ☆28 Updated 9 months ago
- Jupyter Widget for Lux ☆76 Updated 2 years ago
- Financial investment modeling and advanced engineering economics using Python ☆86 Updated 2 years ago
- Extracting Semi-Structured Data from PDFs on a large scale ☆52 Updated 2 years ago