proycon / clam
Quickly turn command-line applications into RESTful webservices with a web-application front-end. You provide a specification of your command-line application, its input, output and parameters, and CLAM wraps around your application to form a fully fledged RESTful webservice.
☆132 Updated 6 months ago
Alternatives and similar repositories for clam
Users that are interested in clam are comparing it to the libraries listed below
- LaMachine - A software distribution of our in-house as well as some 3rd party NLP software - Virtual Machine, Docker, or local compilatio… ☆68 Updated 2 years ago
- FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.g… ☆113 Updated 7 months ago
- An expandable and scalable OCR pipeline ☆87 Updated 7 years ago
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆65 Updated last year
- An intelligent reading agent that understands text and translates it into Wikidata statements. ☆116 Updated 9 years ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆53 Updated 4 years ago
- An advanced, extensible web front-end for the Manatee-open corpus search engine ☆73 Updated last week
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system. ☆46 Updated 7 years ago
- Convert a corpus of PDF to clean text files on a distributed architecture ☆38 Updated last year
- Link Wikidata items to large catalogs ☆96 Updated 6 months ago
- Lightweight, multilingual natural language processing ☆63 Updated 12 years ago
- Automatic tagging and analysis of documents in an Apache Solr index for faceted search by RDF(S) Ontologies & SKOS thesauri ☆47 Updated 3 years ago
- A queue-controlled browser automation tool for improving web crawl quality ☆62 Updated last month
- Python/Flask-based website for text analysis workflow. Previous (stable) release is live at: ☆122 Updated last year
- Tool for collectively summarizing large discussions ☆145 Updated 2 years ago
- spaCy pipeline component for adding text readability meta data to Doc objects. ☆56 Updated 6 years ago
- A workflow system for Natural Language Processing. ☆21 Updated 5 years ago
- Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl,… ☆78 Updated 2 months ago
- Python based Open Source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (Entity Extraction & N… ☆272 Updated 2 years ago
- A set of workflows for corpus building through OCR, post-correction and normalisation ☆49 Updated 3 years ago
- Linguistic search for large annotated text corpora, based on Apache Lucene ☆116 Updated this week
- Wikipedia API wrapper for humans and elk. (en.wikipedia.org/w/api.php, get it?) ☆36 Updated 11 years ago
- Record Linkage ToolKit (Find and link entities) ☆110 Updated 2 years ago
- Python library for reading and writing warc files ☆244 Updated 3 years ago
- A visualisation tool for spaCy using Hierplane. ☆65 Updated 2 years ago
- A Python library for extracting semantic information from text, such as dates and numbers. ☆77 Updated 3 years ago
- Now included in rigour ☆151 Updated 2 weeks ago
- An interface for interacting with MediaWiki ☆37 Updated 3 years ago
- A PDF classifier ensemble with REST API service ☆23 Updated 4 years ago
- A spell-checker extending Peter Norvig's with multi-typo correction, hamming distance weighting, and more. ☆98 Updated 4 years ago