proycon / clam
Quickly turn command-line applications into RESTful webservices with a web-application front-end. You provide a specification of your command line application, its input, output and parameters, and CLAM wraps around your application to form a fully fledged RESTful webservice.
☆134 Updated 3 months ago
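The idea in the description (a specification of a command-line tool, its input and its parameters, wrapped into a web service) can be illustrated with a minimal sketch. This is not CLAM's configuration format or API: the `SPEC` dictionary, the `build_argv`, `run_tool` and `app` names, and the `upper` parameter are all hypothetical, and the wrapped "tool" is just a one-line Python command standing in for a real application. It uses only the standard library and the WSGI calling convention.

```python
import subprocess
import sys
from urllib.parse import parse_qs

# Hypothetical specification (illustrative only, NOT CLAM's configuration format):
# the command to run and the parameters a client is allowed to set.
SPEC = {
    "argv": [
        sys.executable, "-c",
        "import sys; t = sys.stdin.read(); "
        "print(t.upper() if '--upper' in sys.argv else t, end='')",
    ],
    "flags": {"upper": "--upper"},  # request parameter name -> command-line flag
}


def build_argv(spec, params):
    """Translate request parameters into a command line, rejecting unknown ones."""
    argv = list(spec["argv"])
    for name, value in params.items():
        if name not in spec["flags"]:
            raise ValueError(f"unknown parameter: {name}")
        if value == "1":
            argv.append(spec["flags"][name])
    return argv


def run_tool(spec, params, input_text):
    """Run the wrapped command, feeding the input on stdin and returning stdout."""
    result = subprocess.run(
        build_argv(spec, params),
        input=input_text, capture_output=True, text=True, check=True,
    )
    return result.stdout


def app(environ, start_response):
    """WSGI entry point: the request body is the input, the query string holds parameters."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    params = {name: values[0] for name, values in query.items()}
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = environ["wsgi.input"].read(length).decode("utf-8")
    try:
        output, status = run_tool(SPEC, params, body), "200 OK"
    except ValueError as exc:
        output, status = str(exc), "400 Bad Request"
    start_response(status, [("Content-Type", "text/plain; charset=utf-8")])
    return [output.encode("utf-8")]
```

To try it locally, `app` could be served with `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`. A real CLAM service covers far more than this sketch, including the web-application front-end mentioned above.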
Alternatives and similar repositories for clam
Users who are interested in clam are comparing it to the repositories listed below.
- FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.g… ☆113 Updated 11 months ago
- LaMachine - A software distribution of our in-house as well as some 3rd party NLP software - Virtual Machine, Docker, or local compilatio… ☆69 Updated 2 years ago
- An intelligent reading agent that understands text and translates it into Wikidata statements. ☆116 Updated 9 years ago
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system. ☆47 Updated 8 years ago
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆66 Updated last month
- tool for collectively summarizing large discussions ☆145 Updated 3 years ago
- An expandable and scalable OCR pipeline ☆89 Updated 8 years ago
- Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl,… ☆80 Updated last month
- Lightweight, multilingual natural language processing ☆63 Updated 12 years ago
- Python library for reading and writing warc files ☆247 Updated 3 years ago
- Put together a multilingual corpus from a variety of sources. Used for wordfreq and word embeddings. ☆57 Updated 4 years ago
- A Python library for extracting semantic information from text, such as dates and numbers. ☆79 Updated 3 years ago
- Python/Flask-based website for text analysis workflow. Previous (stable) release is live at: ☆122 Updated last year
- A Python library to detect and extract listing data from HTML pages. ☆108 Updated 8 years ago
- Framework for creating and accessing UBY resources – sense-linked lexical resources in standard UBY-LMF format ☆22 Updated 7 years ago
- DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate… ☆52 Updated 5 years ago
- Linguistic search for large annotated text corpora, based on Apache Lucene ☆119 Updated this week
- Github mirror of "wikidata/query/blazegraph" - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer… ☆16 Updated 5 years ago
- Automatic tagging and analysis of documents in an Apache Solr index for faceted search by RDF(S) Ontologies & SKOS thesauri ☆47 Updated 4 years ago
- spaCy pipeline component for adding text readability meta data to Doc objects. ☆56 Updated 6 years ago
- A workflow system for Natural Language Processing. ☆21 Updated 6 years ago
- Convert a corpus of PDF to clean text files on a distributed architecture ☆38 Updated last year
- Memory-based shallow parser for Python ☆74 Updated 6 years ago
- Automatically exported from code.google.com/p/guess-language ☆54 Updated 3 months ago
- Semanticizest: dump parser and client ☆20 Updated 9 years ago
- Tika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on Metadata features. ☆108 Updated 9 months ago
- A toolkit for clustering web pages based on various similarity measures. ☆34 Updated 4 years ago
- A Domain Specific Language (DSL) for building language patterns. These can be later compiled into spaCy patterns, pure regex, or any othe… ☆68 Updated 3 years ago
- 💫 Scripts, tools and resources for developing spaCy ☆126 Updated 6 years ago
- Lightning Fast Language Prediction 🚀 ☆167 Updated 5 months ago