Easy PDF to text to spaCy text extraction in Python.
☆40 · Dec 29, 2025 · Updated 4 months ago
Alternatives and similar repositories for spacypdfreader
Users that are interested in spacypdfreader are comparing it to the libraries listed below.
- Automatic Scraping project for extracting FAQ and Help center articles ☆13 · Dec 8, 2023 · Updated 2 years ago
- ☆23 · Aug 13, 2023 · Updated 2 years ago
- ☆21 · Dec 4, 2024 · Updated last year
- ☆18 · Apr 25, 2021 · Updated 5 years ago
- A spaCy wrapper of OpenTapioca for named entity linking on Wikidata ☆96 · Feb 5, 2026 · Updated 3 months ago
- Finds linguistic patterns effortlessly ☆39 · Aug 29, 2023 · Updated 2 years ago
- Exercises from ParametricCamp - Computational Design Tutorials and Live Streams - From C# to Python ☆17 · Jan 2, 2022 · Updated 4 years ago
- make variables remember their history ☆15 · Jun 2, 2020 · Updated 5 years ago
- NLP-helper for OCR-ed pages in PAGE XML format ☆10 · Dec 6, 2024 · Updated last year
- Write Datasette canned queries as plain SQL files ☆14 · Jul 2, 2022 · Updated 3 years ago
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata ☆170 · Nov 7, 2022 · Updated 3 years ago
- Keyword spaCy is a spaCy pipeline component for extracting keywords from text using cosine similarity. ☆13 · Dec 7, 2023 · Updated 2 years ago
- Python bindings for the htmd Rust library, a fast HTML to Markdown converter ☆13 · Apr 27, 2026 · Updated last week
- WhatIf: Software for Evaluating Counterfactuals ☆18 · May 17, 2023 · Updated 2 years ago
- This repository provides German documentation relating to the text recognition and transcription platform eScriptorium. The documentation… ☆15 · Dec 6, 2025 · Updated 5 months ago
- Collection of Grasshopper Utilities ☆11 · Mar 29, 2025 · Updated last year
- ifcParser in Grasshopper and NotePad++ Express Style ☆10 · Jun 28, 2017 · Updated 8 years ago
- Git-native prompt management and testing framework for production LLM workflows ☆25 · Apr 18, 2026 · Updated 2 weeks ago
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy. ☆84 · Aug 31, 2023 · Updated 2 years ago
- A Streamlit application to visualize sentence embeddings ☆18 · Dec 21, 2022 · Updated 3 years ago
- Materials for workshop "Build A Routing Web Application With OpenStreetMap, Neo4j, and Leaflet.js" ☆15 · Sep 21, 2023 · Updated 2 years ago
- Searching in-memory corpus with Corpus Query Language (CQL) ☆19 · Dec 2, 2024 · Updated last year
- This repository hosts the dataset for the paper Computer Science Named Entity Recognition in the Open Research Knowledge Graph ☆22 · Jan 8, 2024 · Updated 2 years ago
- ☆24 · Aug 11, 2022 · Updated 3 years ago
- Adding Marimo to Datasette ☆21 · Mar 24, 2025 · Updated last year
- Scientific Document Insight Q/A ☆37 · Apr 30, 2026 · Updated last week
- Component to create custom sidebar for streamlit ☆18 · May 28, 2024 · Updated last year
- A database change feed for processing work ☆11 · Feb 5, 2021 · Updated 5 years ago
- A lightweight replacement for virtualenvwrapper when using uv ☆32 · Jan 31, 2025 · Updated last year
- ☆16 · Feb 1, 2025 · Updated last year
- Go(lang) Message Queuing over websockets ☆12 · Jun 12, 2016 · Updated 9 years ago
- ☆11 · Sep 24, 2015 · Updated 10 years ago
- Rust crate for auto-discovery of feeds in HTML content ☆13 · Dec 14, 2021 · Updated 4 years ago
- Python wrapper for the CWB to extract concordances and score frequency lists ☆22 · Jan 12, 2026 · Updated 3 months ago
- A Helpcenter System Built in Elixir's Phoenix and Ash Frameworks ☆21 · Sep 19, 2025 · Updated 7 months ago
- moddwatch watches files and directories for modifications ☆18 · May 23, 2025 · Updated 11 months ago
- ☆13 · Sep 30, 2025 · Updated 7 months ago
- A basic plugin for Grasshopper to read and write shapefile and geojson (GPL) ☆18 · May 27, 2023 · Updated 2 years ago
- Reinforcement plugin for Grasshopper ☆14 · May 1, 2025 · Updated last year