Converts a whole subdirectory containing any number of PDF documents (large or small) into a dataset (pandas DataFrame), with error tracking and a choice of features
☆19 · Jan 9, 2025 · Updated last year
Alternatives and similar repositories for pdf2dataset
Users who are interested in pdf2dataset are comparing it to the libraries listed below.
- Which elected officials of the French Republic (deputies, ministers, mayors) still use x.com? ☆14 · May 24, 2026 · Updated last week
- OpenBudgets Participatory Budgeting ☆13 · Sep 19, 2018 · Updated 7 years ago
- The various public registers of interest representatives (lobbyists) as open data ☆18 · Jan 31, 2023 · Updated 3 years ago
- Python web app built on Streamlit, utilizing LangChain and the OpenAI API to automate YouTube title and script generation. The app offers… ☆12 · May 29, 2023 · Updated 3 years ago
- ☆13 · Mar 9, 2026 · Updated 2 months ago
- Decidim.org public landing website. Made with Middleman. ☆10 · May 22, 2026 · Updated last week
- ☆13 · Mar 19, 2024 · Updated 2 years ago
- Student app written in C# and .NET MAUI ☆10 · Aug 18, 2024 · Updated last year
- Follow the progress of Emmanuel Macron's government ☆13 · Aug 16, 2023 · Updated 2 years ago
- Groquments is a simple demonstration project showcasing how easily PocketGroq can help developers integrate Groq's powerful AI capabiliti… ☆12 · Sep 19, 2024 · Updated last year
- MCP Analyst is an MCP server that empowers Claude to analyze local CSV or Parquet files. ☆18 · Apr 6, 2025 · Updated last year
- ☆13 · Mar 24, 2025 · Updated last year
- Set of scripts for training a hybrid recommender system based on scikit-learn algorithms ☆16 · Nov 14, 2016 · Updated 9 years ago
- ☆10 · Sep 7, 2022 · Updated 3 years ago
- Automatic transcription and translation for Zoom meetings ☆13 · Sep 7, 2020 · Updated 5 years ago
- Per-collection OCR leaderboards using VLM-as-judge ☆59 · Mar 23, 2026 · Updated 2 months ago
- Project demonstrates the power and simplicity of NVIDIA NIM (NVIDIA Inference Microservices), a suite of optimized cloud-native microservices, by… ☆16 · Mar 21, 2024 · Updated 2 years ago
- Replication materials for "Identifying the Development and Application of Artificial Intelligence in Scientific Text" ☆14 · Feb 18, 2020 · Updated 6 years ago
- ☆36 · Mar 11, 2025 · Updated last year
- Preprocessing and analysis for training SNOMED-CT concept embeddings from CORD-19 corpus ☆16 · Aug 4, 2023 · Updated 2 years ago
- ☆21 · Sep 27, 2024 · Updated last year
- Open source for citizen participation platforms of Seoul Metropolitan Government ☆14 · Nov 16, 2022 · Updated 3 years ago
- Project for DCAT-AP-SE. ☆15 · Dec 9, 2024 · Updated last year
- Bot for Tchap (the messaging app of the French State) using Albert, the French administration's artificial intelligence agent ☆15 · Nov 14, 2024 · Updated last year
- Datasets featuring global, high-level flight schedules extracted from aircraft ADS-B position transmissions. Published per quarter of a y… ☆25 · Apr 11, 2026 · Updated last month
- Generation of diagrams and flowcharts for WordPress. ☆20 · Mar 3, 2024 · Updated 2 years ago
- Language learning with AI ☆13 · Oct 11, 2025 · Updated 7 months ago
- QuickJS C FFI generator ☆12 · Nov 21, 2021 · Updated 4 years ago
- Making information from regeringen.se more accessible ☆16 · May 24, 2026 · Updated last week
- Offline LLM chatbot with personalized memory; works on CPU with multi-session memory support. ☆22 · Jan 10, 2026 · Updated 4 months ago
- Generic ASM Vulnerability Schema XSLT ☆12 · May 30, 2018 · Updated 8 years ago
- HTTPFS extension for DuckDB. Adds support for an HTTPFileSystem and S3FileSystem. ☆19 · Nov 4, 2024 · Updated last year
- Transcribe your audio and video files locally, totally secure ☆17 · Mar 3, 2026 · Updated 2 months ago
- Translations of the GAFAM poster campaign by La Quadrature du Net ☆21 · May 4, 2026 · Updated 3 weeks ago
- A quick glimpse into the Swedish government's remisser (consultation referrals) ☆15 · Apr 25, 2024 · Updated 2 years ago
- A universal MCP (Model Context Protocol) server to integrate any API with Claude Desktop using only Docker configurations. ☆41 · Mar 25, 2026 · Updated 2 months ago
- A tool for creating pivot tables from the command line. ☆14 · Mar 16, 2023 · Updated 3 years ago
- Deprecated, https://github.com/PY-Learning/wbot ☆11 · Mar 17, 2017 · Updated 9 years ago
- (WIP) Various language support for libpglite native ☆23 · Aug 5, 2025 · Updated 9 months ago