A tool for text normalisation via character-level machine translation
☆13 Jun 12, 2020 Updated 5 years ago
Alternatives and similar repositories for csmtiser
Users who are interested in csmtiser are comparing it to the libraries listed below.
- Compiled tools, datasets, and other resources for historical text normalization. ☆21 Jun 18, 2019 Updated 6 years ago
- Data and scripts for the proper evaluation of cross-lingual embeddings in multiple languages ☆15 Apr 11, 2020 Updated 6 years ago
- Digital humanities centered on graph technologies ☆10 Feb 12, 2026 Updated 3 months ago
- A powerful, tagset-independent and theory-neutral meta model and API for storing, manipulating, and representing nearly all types of ling… ☆15 Mar 27, 2023 Updated 3 years ago
- ASR transcription and SLU annotation web interface for call logs collected at UFAL-DSG. ☆11 Dec 8, 2014 Updated 11 years ago
- Upcoming ACL 2020 paper ☆26 May 8, 2020 Updated 6 years ago
- Further developed as SyntaxDot: https://github.com/tensordot/syntaxdot ☆13 Dec 18, 2020 Updated 5 years ago
- Collection de romans français du dix-huitième siècle (1751-1800) / Collection of Eighteenth-Century French Novels (1751-1800) ☆23 Apr 23, 2024 Updated 2 years ago
- TweetCaT - a tool for building Twitter corpora of smaller languages or specific geographical regions ☆12 May 18, 2017 Updated 9 years ago
- Tentative way towards a shared API for prosopographical data based on the factoid model (Bradley/Short 2005) ☆24 Aug 25, 2022 Updated 3 years ago
- Lexicons for the Multilingual UCREL Semantic Analysis System ☆49 Mar 11, 2026 Updated 2 months ago
- clarin-dspace digital repository based on DSpace and LINDAT/CLARIN DSpace ☆28 Updated this week
- minimal examples of brat annotation visualizations ☆17 Jan 21, 2015 Updated 11 years ago
- A makeshift Python program which relies on NLTK and Stanford CoreNLP models to expand common contractions in the English language. ☆10 Nov 8, 2017 Updated 8 years ago
- NLPBench: Evaluating NLP-Related Problem-solving Ability in Large Language Models ☆10 Oct 27, 2023 Updated 2 years ago
- Code for the paper "Refining Language Model with Compositional Explanation" (NeurIPS 2021) ☆11 Oct 25, 2021 Updated 4 years ago
- Public Comment Analysis Project for the Federal Chief Data Officer Council. The Comment Analysis pilot has shown that a toolset leveragin… ☆13 Sep 17, 2021 Updated 4 years ago
- Erlangen CRM - An OWL implementation of the CIDOC Conceptual Reference Model ☆45 Sep 20, 2024 Updated last year
- A plugin that provides support for working with Digital Facsimiles in Text Encoding Initiative (TEI) vocabulary. The plugin contribute… ☆25 Jun 16, 2025 Updated 11 months ago
- ☆13 Jul 26, 2023 Updated 2 years ago
- Automated Twitter bots, run by the artificial artificial intelligence of Amazon Mechanical Turk. ☆32 Dec 23, 2010 Updated 15 years ago
- This repository contains simple code in Python to help historians prepare data for quantitative analysis & visualization. Visit the follo… ☆27 May 11, 2026 Updated 2 weeks ago
- OWL-ontologies for Humanities, developed in the NIE-INE project (National Infrastructure for Editions) ☆20 Mar 16, 2021 Updated 5 years ago
- The code for NeurIPS 2020 paper: Adversarial Crowdsourcing Through Robust Rank-One Matrix Completion. ☆10 Oct 26, 2020 Updated 5 years ago
- A repository of legal NLP research papers. ☆12 Jan 3, 2020 Updated 6 years ago
- Python tool for normalizing text and text canonicalization (DISCONTINUED) ☆41 Sep 3, 2013 Updated 12 years ago
- a smart filter script for all qmail lovers ☆17 Aug 6, 2014 Updated 11 years ago
- Code and data for the WSDM '19 paper "Crosslingual Document Embedding as Reduced-Rank Ridge Regression (Cr5)" ☆30 Aug 17, 2019 Updated 6 years ago
- TurkGate: Grouping and Access Tools for External surveys (for use with Amazon Mechanical Turk) ☆27 Oct 27, 2015 Updated 10 years ago
- Benchmark datasets for sentiment analysis ☆12 May 18, 2020 Updated 6 years ago
- Generic Environment for Context-Aware Correction of Orthography ☆23 Sep 7, 2022 Updated 3 years ago
- A curated list of Angular 2 libraries ☆24 Jan 29, 2017 Updated 9 years ago
- SHACL Community Group (Post-REC activities) ☆37 Jan 27, 2025 Updated last year
- Arabic light stemmer. Light stemming for Arabic words removes prefixes and suffixes and normalizes words ☆19 Dec 16, 2021 Updated 4 years ago
- TweetBERT: A Pretrained Language Representation Model for Twitter Text Analysis ☆15 Jun 1, 2022 Updated 3 years ago
- ☆15 Sep 5, 2016 Updated 9 years ago
- Implementation of an Openset Recognition algorithm. ☆12 Sep 13, 2020 Updated 5 years ago
- T2NER: Transformers based Transfer Learning Framework for Named Entity Recognition (EACL 2021) ☆11 Sep 24, 2022 Updated 3 years ago
- First time taking part in a big data competition ☆11 Jan 14, 2018 Updated 8 years ago