Turn PDF documents into simple annotated XML for further processing in a corpus preparation pipeline.
☆13 · Nov 19, 2019 · Updated 6 years ago
Alternatives and similar repositories for trickypdf
Users that are interested in trickypdf are comparing it to the libraries listed below.
- Materials for the 2022 GESIS Training workshop "Tools and Workflows for Reproducible Research in the Quantitative Social Sciences" ☆20 · Nov 19, 2022 · Updated 3 years ago
- Knack Toolkit Library ☆29 · Updated this week
- A plugin to integrate Facebook with MyBB, letting users login and register through Facebook. ☆27 · Aug 7, 2020 · Updated 5 years ago
- Create and analyze argument graphs and serialize them via Protobuf ☆10 · Mar 18, 2026 · Updated last week
- A fully programmable, multi-platform, syntax-slick modern language. Let’s finish this strong. 💪 ☆22 · Jun 15, 2025 · Updated 9 months ago
- GE 2015/17 + EU Ref voter density shiny app ☆14 · Nov 15, 2017 · Updated 8 years ago
- GermaParl: Corpus of Plenary Protocols of the German Bundestag (TEI Format) ☆38 · Jun 1, 2023 · Updated 2 years ago
- Corpus In A Box: Automated Tools, Tutorials, & Advising ☆11 · Dec 1, 2022 · Updated 3 years ago
- A template to write a reproducible paper in R Markdown. ☆18 · Jun 20, 2023 · Updated 2 years ago
- Terraform playbook of a vulnerable Azure deployment ☆11 · Apr 28, 2022 · Updated 3 years ago
- Free cybersecurity training resources ☆12 · Feb 5, 2020 · Updated 6 years ago
- A repository for text_processing tools used by crow ☆12 · Mar 21, 2025 · Updated last year
- Example of (micro)services to do conversion from Microsoft Word Docx files to PDF using products on Google Cloud Platform ☆20 · Apr 26, 2019 · Updated 6 years ago
- Proof of concept tool used for phishing multi-factor authentication on O365 ☆14 · Aug 8, 2018 · Updated 7 years ago
- ☆26 · Mar 12, 2026 · Updated last week
- markupy - HTML in Python ☆21 · Jun 2, 2025 · Updated 9 months ago
- Word2Vec in pure Python ☆19 · Jun 13, 2018 · Updated 7 years ago
- R-package for text mining with the Corpus Workbench (CWB) as backend ☆49 · Mar 26, 2025 · Updated 11 months ago
- Repository for CRAN package BatchGetSymbols ☆18 · Feb 2, 2026 · Updated last month
- R function and addin to easily insert tables in Rmd code chunks ☆37 · Sep 16, 2019 · Updated 6 years ago
- Guide for fixing 99-100% of cracking sound issues on Dell XPS 15 9570 ☆11 · Nov 1, 2018 · Updated 7 years ago
- The `hp2xx' program is a versatile tool to convert vector-oriented graphics data given in Hewlett-Packard's HP-GL plotter language into a… ☆18 · Feb 1, 2020 · Updated 6 years ago
- Programmatically collect normalized news from (almost) any website using R ☆30 · Sep 20, 2023 · Updated 2 years ago
- Quick and dirty .net console app for querying mssql servers. ☆24 · Aug 30, 2018 · Updated 7 years ago
- R package for estimating speaker style distinctiveness in texts. Install it from CRAN! ☆34 · Mar 4, 2021 · Updated 5 years ago
- Topic-Specific Diagnostics for LDA and CTM Topic Models ☆25 · Jul 17, 2022 · Updated 3 years ago
- ULMFiT Method for German Language ☆15 · May 10, 2019 · Updated 6 years ago
- Python wrapper for the CWB to extract concordances and score frequency lists ☆22 · Jan 12, 2026 · Updated 2 months ago
- Windows Batch script to install and setup the Splunk Universal Forwarder ☆11 · Feb 24, 2020 · Updated 6 years ago
- Transliterate español (spanish) spelling to andaluz proposals using python ☆27 · Jan 13, 2026 · Updated 2 months ago
- A containerized all-in-one solution for CQPWeb ☆18 · Jan 22, 2023 · Updated 3 years ago
- Github Pages deployment for Ansible Best Practices ☆12 · Jan 12, 2026 · Updated 2 months ago
- A set of simple tools to assist users of the Interactive Brokers API. ☆28 · Apr 18, 2024 · Updated last year
- Extract tables from PDFs using LLMWhisperer and extract structured information from those tables using Langchain ☆49 · Oct 7, 2024 · Updated last year
- A part-of-speech tagger with support for domain adaptation and external resources. ☆24 · Oct 26, 2022 · Updated 3 years ago
- HTML5 Canvas-based version of the classic arcade game Frogger ☆37 · Sep 30, 2020 · Updated 5 years ago
- Some basic CI for Splunk Apps. ☆11 · Jan 8, 2020 · Updated 6 years ago
- Library of sites for categorization ☆28 · Feb 12, 2019 · Updated 7 years ago
- Convert a PDF via OCR to a TXT file in UTF-8 encoding ☆159 · Oct 3, 2023 · Updated 2 years ago