icaropires/pdf2dataset

Converts a whole subdirectory with a big (or small) volume of PDF documents to a dataset (pandas DataFrame) with error tracking and choice of features

☆18

Alternatives and similar repositories for pdf2dataset

Users that are interested in pdf2dataset are comparing it to the libraries listed below.

petermartens98 / LangChain-AutoGPT-YouTube-Script-Generation-Streamlit-App
Python web app built on Streamlit, utilizing LangChain and the OpenAI API to automate YouTube title and script generation. The app offers…
☆12 · May 29, 2023 · Updated 3 years ago
CristianCosci / BTC_dataset_Generator_Glassnode
Python script to create a dataset with all the features available on Glassnode for the analysis of the Bitcoin cryptocurrency.
☆12 · Mar 24, 2023 · Updated 3 years ago
sgerodes / python-three-commas
☆10 · Sep 7, 2022 · Updated 3 years ago
ModuNLP / hacking_transformers
☆11 · Aug 12, 2020 · Updated 5 years ago
diggsweden / DCAT-AP-SE
Project for DCAT-AP-SE.
☆15 · Dec 9, 2024 · Updated last year
prometheus-community / snmp
Tools and configurations for translating SNMP into Prometheus
☆15 · Jul 19, 2026 · Updated last week
kydlikebtc / mcp-server-bn
☆14 · Mar 24, 2025 · Updated last year
meirm / ChatGPTApp
☆13 · Mar 19, 2024 · Updated 2 years ago
mickymultani / nvidia-NIM-RAG
Project demonstrates the power and simplicity of NVIDIA NIM (NVIDIA Inference Model), a suite of optimized cloud-native microservices, by…
☆16 · Mar 21, 2024 · Updated 2 years ago
jgravelle / Groquments
Groquments is a simple demonstration project showcasing how easily PocketGroq can help developers integrate Groq's powerful AI capabiliti…
☆12 · Sep 19, 2024 · Updated last year
corradio / footprintmap
A visualisation of the CO2 emissions of the global economy
☆14 · Nov 18, 2022 · Updated 3 years ago
etalab-ia / albert-tchap
Bot for Tchap (the messaging app of the French State) using Albert, the French administration Artificial Intelligence agent
☆15 · Nov 14, 2024 · Updated last year
samos123 / gke-node-ca-importer
☆14 · Mar 15, 2023 · Updated 3 years ago
google / support-case-notifications
☆13 · Mar 10, 2023 · Updated 3 years ago
mtasic85 / quickjs-cffi-generator
QuickJS C FFI generator
☆12 · Nov 21, 2021 · Updated 4 years ago
drgriffis / text-essence
Preprocessing and analysis for training SNOMED-CT concept embeddings from CORD-19 corpus
☆16 · Aug 4, 2023 · Updated 2 years ago
FranQuant / ML-and-DL-based-Investment-Strategies-for-BTC
ML & DL based Investment Strategies for BTC using Technical Trading Indicators and On-Chain Data Analysis
☆15 · Apr 16, 2025 · Updated last year
bbelderbos / sa-graph
☆11 · Sep 4, 2021 · Updated 4 years ago
staffanm / ferenda
Transform unstructured document collections to structured Linked Data
☆30 · Updated this week
neo4j-product-examples / graphrag-customer-experience
☆15 · May 28, 2025 · Updated last year
duckdb / duckdb_httpfs_wasm_experiment
HTTPFS extension for DuckDB. Adds support for an HTTPFileSystem and S3FileSystem.
☆18 · Nov 4, 2024 · Updated last year
arkady-emelyanov / pyarrow-flight
Apache Arrow Flight example
☆10 · Nov 9, 2020 · Updated 5 years ago
CopilotKit / canvas-with-llamaindex-composio
☆48 · Mar 12, 2026 · Updated 4 months ago
magnetai / mcp-free-usdc-transfer
MCP (Model Context Protocol) server - free usdc transfer powered by Coinbase CDP
☆21 · Jan 17, 2025 · Updated last year
akamai / cli-netstorage
☆15 · Jan 31, 2024 · Updated 2 years ago
maxblee / clipivot
A tool for creating pivot tables from the command line.
☆14 · Mar 16, 2023 · Updated 3 years ago
ltphen / dopla-ai
Language learning with AI
☆13 · Oct 11, 2025 · Updated 9 months ago
ciknight / wbot
Deprecated, https://github.com/PY-Learning/wbot
☆10 · Mar 17, 2017 · Updated 9 years ago
stolostron / integrity-shield
Integrity Shield is a tool for built-in preventive integrity control for regulated cloud workloads. It provides signature-based assurance…
☆17 · Sep 22, 2022 · Updated 3 years ago
ehartford / BetterChatGPT
An amazing UI for OpenAI's ChatGPT (Website + Windows + MacOS + Linux)
☆19 · Sep 30, 2023 · Updated 2 years ago
blueswen / observability-workshop-101
Build a lab scale end-to-end Observability Platform.
☆23 · Nov 9, 2023 · Updated 2 years ago
kaustubhgupta / analytics-vidhya-demo
This repo contains the code demonstrated in the Analytics Vidhya article about PyWebIO usage and the ML model prediction code.
☆10 · Apr 22, 2021 · Updated 5 years ago
vialab / docuburst-desktop
Desktop Version of Docuburst
☆19 · Nov 14, 2016 · Updated 9 years ago
anselal / docker-isthmos
Minimalistic Docker UI based on Flask, docker-py and w2ui
☆14 · Dec 8, 2022 · Updated 3 years ago
doggy8088 / serverinfoaspx
Displays complete server information using a single ASPX file
☆17 · Aug 19, 2023 · Updated 2 years ago
qeinfinity / binance-mcp-server
☆20 · Dec 29, 2024 · Updated last year
kubernetes-sigs / release-utils
☆24 · Jul 21, 2026 · Updated last week
EMBEDDIA / TransSHAP
Interpreting BERT with LIME and SHAP
☆11 · Jun 12, 2023 · Updated 3 years ago
UKHomeOffice / docker-symmetricds
Docker image for one-way replication with SymmetricDS
☆10 · Dec 2, 2025 · Updated 7 months ago