oyvindberg/PDFExtract

my take at a PDF text extraction utility

☆26

Alternatives and similar repositories for PDFExtract

Users that are interested in PDFExtract are comparing it to the libraries listed below.

ckorzen / pdf-text-extraction-benchmark
A project about benchmarking and evaluating existing PDF extraction tools on their semantic abilities to extract the body texts from PDF …
☆73 · Nov 7, 2020 · Updated 5 years ago
tamirhassan / pdfxtk
PDF Extraction Toolkit
☆43 · Nov 23, 2020 · Updated 5 years ago
peterwilliams97 / git-stats
Compute statistics on git repositories
☆10 · May 29, 2019 · Updated 7 years ago
vhyza / lemmagen-lexicons
Language lexicons for elasticsearch https://github.com/vhyza/elasticsearch-analysis-lemmagen plugin
☆15 · Dec 11, 2018 · Updated 7 years ago
plin / slovnik
Czech morphological dictionary
☆14 · Feb 4, 2019 · Updated 7 years ago
ad-freiburg / pdfact
A basic tool that extracts the structure from the PDF files of scientific articles.
☆77 · Jan 4, 2022 · Updated 4 years ago
dlight / pdftotext
High-level Rust library that binds to Poppler to extract text from a PDF
☆11 · Dec 16, 2020 · Updated 5 years ago
chetmurthy / ensemble
The Ensemble distributed communications toolkit
☆13 · Jul 26, 2020 · Updated 5 years ago
wabzqem / vaxproxy
☆12 · Sep 21, 2021 · Updated 4 years ago
manuel / delimgen
Delimited Generators - Minimal Delimited Control for JS
☆13 · May 13, 2024 · Updated 2 years ago
William1617 / DTLN_RKNN
☆10 · May 30, 2024 · Updated 2 years ago
rizo / rego
Reasonable Go.
☆10 · Aug 13, 2018 · Updated 7 years ago
starenka / pandas_djmodel
Generates Django model definition from Pandas DataFrame
☆17 · May 25, 2018 · Updated 8 years ago
nichtich / marginalia
Extract Annotations from PDF files
☆19 · Nov 17, 2010 · Updated 15 years ago
Diderot-Language / examples
Examples of using Diderot
☆10 · Sep 16, 2019 · Updated 6 years ago
BMKEG / lapdftextProject
High-level build project for all LAPDF-Text submodules
☆103 · Jul 2, 2015 · Updated 11 years ago
janestreet / ppx_hash
A ppx rewriter that generates hash functions from type expressions and definitions
☆16 · Jul 10, 2026 · Updated last week
go-air / dupi
A tool to find all duplicates in large sets of text documents.
☆16 · Sep 29, 2021 · Updated 4 years ago
ohenrik / nb_dep_ud_sm
Spacy model trained based on Norwegian corpus converted from OBT to Universal dep.
☆13 · Jan 31, 2018 · Updated 8 years ago
Nykakin / quantize
Image quantization in Golang
☆20 · Mar 20, 2019 · Updated 7 years ago
liuzl / china-address-code
Scraper for China's five-level (province, city, county, township, village) addresses from the National Bureau of Statistics, http://www.stats.gov.cn/tjsj/tjbz/tjyqhdmhcxhfdm/2018/index.html
☆12 · Jan 8, 2020 · Updated 6 years ago
ellisa1419 / Wordnet-Query-Expansion
☆12 · Aug 29, 2019 · Updated 6 years ago
yangyuan / brown-clustering
Brown clustering in Python
☆22 · Dec 12, 2017 · Updated 8 years ago
go-zoox / chatgpt-client
ChatGPT-Client is a ChatGPT Client with Official OpenAI API.
☆12 · May 30, 2024 · Updated 2 years ago
domoritz / csv2arrow
Convert CSV files to Apache Arrow.
☆15 · Feb 2, 2023 · Updated 3 years ago
vidurj / parser-adaptation
☆12 · Dec 8, 2022 · Updated 3 years ago
stapelberg / goturbopfor
Teaching implementation of the TurboPFor integer compression algorithm
☆23 · Feb 5, 2019 · Updated 7 years ago
diprism / fggs
Factor Graph Grammars in Python
☆14 · Jan 17, 2026 · Updated 6 months ago
Fakerr / go-paddle
Go library for accessing the Paddle API
☆10 · Apr 14, 2022 · Updated 4 years ago
andreyvit / openai
Best way to use ChatGPT/GPT-3 with Go: zero dependencies, tokenizer, under 1500 LOC
☆14 · Jul 18, 2024 · Updated 2 years ago
hsnr-gamera / gamera-4
Gamera 4 for Python 3
☆14 · May 16, 2025 · Updated last year
philzook58 / snakelog
A Datalog Framework for Python
☆18 · Mar 8, 2023 · Updated 3 years ago
ziqizhang / sti
Implementation of algorithms for semantic table implementation, including the TableMiner+ method
☆19 · Sep 1, 2022 · Updated 3 years ago
gonzojive / or-tools-go
WIP on a go library for Google's Operational Research Tools
☆11 · Jul 6, 2023 · Updated 3 years ago
Schmavery / refmt-web
[OLD/Deprecated] Translate OCaml to Reason ON THE WEB
☆18 · Dec 16, 2016 · Updated 9 years ago
hieule88 / SpeechSeparation
Using SepFormer
☆10 · Feb 2, 2023 · Updated 3 years ago
zach-klippenstein / go-typedregexp
typedregexp matches regular expressions into structs.
☆15 · Jan 20, 2016 · Updated 10 years ago
semantic-health / allennlp-multi-label
A multi-label classification plugin for AllenNLP.
☆11 · Jan 13, 2023 · Updated 3 years ago
ocurrent / current_incr
Self-adjusting computations
☆23 · Oct 9, 2023 · Updated 2 years ago