KaniyamFoundation / Pdf2Text
Project to convert PDF files to text files using Google OCR
☆13Updated last year
Alternatives and similar repositories for Pdf2Text
Users that are interested in Pdf2Text are comparing it to the libraries listed below
- A project to collect all tamil nouns☆11Updated 7 months ago
- ThamizhiMorph: A Tamil Morphological Analyser and Generator☆20Updated last year
- Tamil Language words list☆11Updated 9 years ago
- OCR for WikiSource using Google Drive OCR☆34Updated last year
- A cloud-based, open-source system for writing and publishing dictionaries.☆93Updated last year
- The e-texts of the SARIT project☆40Updated 2 months ago
- Karthika - An offline Tamil Wiktionary in Python☆17Updated 13 years ago
- Data for the quantitative study of (Vedic) Sanskrit☆127Updated last month
- Resources to go with the Indic NLP Library☆74Updated 3 years ago
- A rule-based iterative affix stripping stemmer for Tamil☆44Updated 2 months ago
- A character-wise tokenizer for morphologically rich languages☆28Updated 5 months ago
- Programs, tools, and data for natural language processing in Tamil☆74Updated 5 months ago
- Morphological analyzer and lemmatizer for Latin.☆27Updated 6 months ago
- Comparable documents miner: Arabic-English morphological analysis, text processing, n-gram features extraction, POS tagging, dictionary t…☆35Updated 8 years ago
- A Python based API to access Indian language WordNets.☆38Updated 3 years ago
- Hindi wordlists, dictionary and affix files in hunspell format☆39Updated 4 years ago
- A context-based spellchecker for correcting OCR output.☆20Updated 2 years ago
- Machine Translation from Sanskrit to Hindi using Unsupervised and Supervised Learning☆19Updated 4 years ago
- A Directory of Online Newspaper Sources for 70+ Languages☆32Updated 4 years ago
- Perseus Treebank Data☆73Updated last year
- Python package for indic script transliteration☆191Updated this week
- A simple text reuse detection CLI tool.☆136Updated last year
- eXtensible Interlinear Glossed Text☆33Updated 3 years ago
- ☆14Updated 4 years ago
- Latin BERT☆66Updated last year
- Various commentaries on Ashtadhyayi of Panini.☆26Updated 2 weeks ago
- Python library for automatic analysis of Ancient Greek hexameter. The algorithm uses linguistic rules and finite-state technology.☆20Updated last year
- I created this repository to provide the DH Community a compilation of free, open-source tools for creating and developing digital humani…☆37Updated 2 years ago
- Snapshots of the GRETIL repository of South Asian (Sanskrit, Pali, etc.) etexts☆9Updated 2 weeks ago
- Soundex Phonetic Code Algorithm Demo for Indian Languages. Supports all Indian languages and English. Provides intra-indic string compari…☆58Updated 6 years ago