Wrapper for pdftohtml that tries to extract paragraph structure
☆52 Nov 29, 2018 Updated 7 years ago
Alternatives and similar repositories for pdf2html
Users who are interested in pdf2html compare it to the repositories listed below
- Zurich Morphological Lexicon for German: a tool to extract a morphological lexicon from Wiktionary ☆12 Aug 10, 2023 Updated 2 years ago
- Links parts of input text to Wikipedia articles ☆16 Sep 9, 2012 Updated 13 years ago
- Python natural language processing work ☆29 Sep 14, 2009 Updated 16 years ago
- Parser for KAF NAF files written in Python ☆16 Jul 1, 2021 Updated 4 years ago
- ☆21 Apr 4, 2015 Updated 10 years ago
- A BiRNN framework implemented in Python and TensorFlow to extract parallel sentences from aligned comparable corpora. ☆33 Sep 4, 2018 Updated 7 years ago
- Financial Analysis and Algorithmic Trading Strategies in Python ☆11 Feb 16, 2023 Updated 3 years ago
- Files for the Karma tutorial at TCDL, Texas Conference on Digital Libraries ☆29 Apr 17, 2016 Updated 9 years ago
- ☆12 Oct 28, 2021 Updated 4 years ago
- Collection of coursework for the Linux System and Network Administration course http://sec.cuc.edu.cn ☆10 Mar 14, 2022 Updated 3 years ago
- A framework, data and configs for generating and building Tesseract OCR lang.traineddata model files, specifically for Japanese ☆10 Dec 9, 2013 Updated 12 years ago
- maximum entropy based part-of-speech tagger for NLTK ☆45 Dec 8, 2016 Updated 9 years ago
- A website with interesting data for Somerville residents ☆32 Dec 13, 2017 Updated 8 years ago
- Madek main web interface ☆21 Updated this week
- ☆14 Updated this week
- (Labeled) Latent Dirichlet Allocation on a sentence level with Gibbs Sampling ☆10 Mar 27, 2014 Updated 11 years ago
- Container to test Ansible roles in, including capabilities to use openrc facilities ☆11 Sep 24, 2025 Updated 5 months ago
- Focused Crawler for VT's CTRNet ☆10 May 13, 2013 Updated 12 years ago
- Grecka is a python script to convert Greek to Greeklish based on ELOT 743 ☆12 Aug 4, 2018 Updated 7 years ago
- Documentation sources for syslog-ng Open Source Edition (https://github.com/syslog-ng/syslog-ng) ☆10 May 6, 2024 Updated last year
- HTML renderer and Schematron rules for NISO STS ☆12 Dec 5, 2025 Updated 2 months ago
- Redis tcp map for postfix ☆12 Jun 28, 2024 Updated last year
- Tox plugin for WeeChat ☆50 Apr 2, 2019 Updated 6 years ago
- Examples for Consul ACLs for single or multiple datacenters. ☆12 Jul 31, 2020 Updated 5 years ago
- Experimental Redis plugin for Vim ☆13 Jun 8, 2013 Updated 12 years ago
- Workflow: Convert CONTROL-freec output to GISTIC2 ☆10 Apr 20, 2021 Updated 4 years ago
- Brand disambiguator for tweets to differentiate e.g. Orange vs orange (brand vs foodstuff), using NLTK and scikit-learn ☆58 Jul 11, 2013 Updated 12 years ago
- An EasyMotion plugin for Qt Creator ☆11 Feb 1, 2016 Updated 10 years ago
- ☆13 Jul 21, 2016 Updated 9 years ago
- TAUS Dynamic Quality Framework API ☆12 Sep 17, 2020 Updated 5 years ago
- A simple python script written using Selenium and BeautifulSoup to extract every horse's information in the races. ☆18 Dec 24, 2017 Updated 8 years ago
- Super efficient TCP connection between remote processes ☆12 Apr 7, 2016 Updated 9 years ago
- 2019 Fall - Game theory and Multi-agent RL Termproject ☆10 Dec 13, 2019 Updated 6 years ago
- An example of centralising clojure/java logging with Logback, LogStash, ElasticSearch, and Kibana ☆17 Mar 27, 2014 Updated 11 years ago
- Small simple projects using zetes library ☆35 Jul 25, 2015 Updated 10 years ago
- [ESWC '24] This repo is official implementation for the paper "Towards Harnessing Large Language Models as Autonomous Agents for Semantic… ☆10 May 25, 2024 Updated last year
- Javascript bayesian network library, inference, learning ☆15 Nov 8, 2019 Updated 6 years ago
- Python interface for the Berkeley Parser using JPype ☆12 Dec 18, 2015 Updated 10 years ago
- The Facile API is capable of reading (decompiling) .Net assemblies. Covering the metadata tables, the embedded types and methods, includi… ☆10 Feb 23, 2020 Updated 6 years ago