clarinsi/csmtiser


A tool for text normalisation via character-level machine translation

☆13

Alternatives and similar repositories for csmtiser

Users that are interested in csmtiser are comparing it to the libraries listed below.

comphist / norma
A tool for automatic spelling normalization
☆22 · Jan 18, 2021 · Updated 5 years ago
antonisa / embeddings
Data and scripts for the proper evaluation of cross-lingual embeddings in multiple languages
☆15 · Apr 11, 2020 · Updated 6 years ago
kuczera / Graphentechnologien
Digital humanities centered on graph technologies
☆10 · Feb 12, 2026 · Updated 5 months ago
pranav-ust / 2kenize
Upcoming ACL 2020 paper
☆26 · May 8, 2020 · Updated 6 years ago
stickeritis / sticker2
Further developed as SyntaxDot: https://github.com/tensordot/syntaxdot
☆13 · Dec 18, 2020 · Updated 5 years ago
MiMoText / roman18
Collection de romans français du dix-huitième siècle (1751-1800) / Collection of Eighteenth-Century French Novels (1751-1800)
☆23 · Apr 23, 2024 · Updated 2 years ago
clarinsi / tweetcat
TweetCaT - a tool for building Twitter corpora of smaller languages or specific geographical regions
☆12 · May 18, 2017 · Updated 9 years ago
clarin-eric / parla-clarin
Schema for modelling parliamentary debates
☆23 · May 23, 2022 · Updated 4 years ago
MirunaPislar / multi-head-attention-labeller
Joint text classification on multiple levels with multiple labels, using a multi-head attention mechanism to wire two prediction tasks to…
☆16 · Oct 17, 2020 · Updated 5 years ago
MU94W / Tacotron
Tacotron: Towards End-to-End Speech Synthesis
☆16 · Sep 26, 2017 · Updated 8 years ago
IPIF / prosopogrAPhI
Tentative way towards a shared API for prosopographical data based on the factoid model (Bradley/Short 2005)
☆24 · Aug 25, 2022 · Updated 3 years ago
lenakmeth / Wikinflection-Corpus
The Wikinflection Corpus, from the paper "Wikinflection Corpus: A (Better) Multilingual, Morpheme-Annotated Inflectional Corpus" (Metheni…
☆12 · Dec 15, 2023 · Updated 2 years ago
emanjavacas / pie
A fully fledged PyTorch package for morphological analysis, tailored to morphologically rich and historical languages.
☆25 · Oct 27, 2023 · Updated 2 years ago
ufal / clarin-dspace
clarin-dspace digital repository based on DSpace and LINDAT/CLARIN DSpace
☆28 · Updated this week
arne-cl / brat-embedded-visualization-examples
Minimal examples of brat annotation visualizations
☆17 · Jan 21, 2015 · Updated 11 years ago
OpenTOSCA / winery
A web-based environment to graphically model TOSCA topologies.
☆17 · May 14, 2024 · Updated 2 years ago
stephenroller / word2vecfz
Dependency-based Word Embeddings (Levy and Goldberg, 2014) with BZ2 compression support.
☆21 · Jan 13, 2016 · Updated 10 years ago
HuygensING / TAG
☆25 · May 27, 2021 · Updated 5 years ago
LinxinS97 / NLPBench
NLPBench: Evaluating NLP-Related Problem-solving Ability in Large Language Models
☆10 · Oct 27, 2023 · Updated 2 years ago
kenambrose-GSA / CDO-Council-Public-Comment-Analysis-Project
Public Comment Analysis Project for the Federal Chief Data Officer Council. The Comment Analysis pilot has shown that a toolset leveragin…
☆13 · Sep 17, 2021 · Updated 4 years ago
xaiguy / chippy
☆13 · Feb 26, 2023 · Updated 3 years ago
oxygenxml / TEI-Facsimile-Plugin
A plugin that provides support for working with Digital Facsimiles in Text Encoding Initiative (TEI) vocabulary. The plugin contribute…
☆25 · Jun 16, 2025 · Updated last year
erlangen-crm / ecrm
Erlangen CRM - An OWL implementation of the CIDOC Conceptual Reference Model
☆46 · Sep 20, 2024 · Updated last year
phalodi / Email_Spam_Spark
A small project that predicts whether an email will go to the spam folder or the primary folder.
☆11 · Jul 5, 2016 · Updated 10 years ago
GChrysostomou / ood_faith
☆13 · Jul 26, 2023 · Updated 3 years ago
nbroad1881 / strideformer
Using short models to classify long texts
☆21 · Mar 8, 2023 · Updated 3 years ago
nie-ine / Ontologies
OWL ontologies for the humanities, developed in the NIE-INE project (National Infrastructure for Editions)
☆20 · Mar 16, 2021 · Updated 5 years ago
Pzoom522 / HistSumm
Code and data for "Summarising Historical Text in Modern Languages" (EACL 2021)
☆71 · Apr 22, 2021 · Updated 5 years ago
BlackBoiler / legal-nlp-papers
A repository of legal NLP research papers.
☆13 · Jan 3, 2020 · Updated 6 years ago
soshial / text-normalization
Python tool for normalizing text and text canonicalization (DISCONTINUED)
☆41 · Sep 3, 2013 · Updated 12 years ago
benhutchins / docker-mediawiki
Docker container for MediaWiki
☆30 · Apr 8, 2021 · Updated 5 years ago
brnhffmnn / qmail-aliasfilter
A smart filter script for all qmail lovers
☆17 · Aug 6, 2014 · Updated 11 years ago
epfl-dlab / Cr5
Code and data for the WSDM '19 paper "Crosslingual Document Embedding as Reduced-Rank Ridge Regression (Cr5)"
☆30 · Aug 17, 2019 · Updated 6 years ago
zorec / angular-libraries
A curated list of Angular 2 libraries
☆24 · Jan 29, 2017 · Updated 9 years ago
G-Research / dgraph-dbpedia
Pre-processing DBpedia datasets to load into Dgraph
☆13 · Mar 6, 2022 · Updated 4 years ago
LiyuanLucasLiu / Fast-Furious-Paper
Reading Group @ DMG
☆11 · Nov 15, 2018 · Updated 7 years ago
proycon / gecco
Generic Environment for Context-Aware Correction of Orthography
☆24 · Sep 7, 2022 · Updated 3 years ago
w3c / shacl
SHACL Community Group (Post-REC activities)
☆37 · Jan 27, 2025 · Updated last year
motazsaad / arabic-light-stemmer
Arabic light stemmer. Light stemming for Arabic words removes prefixes and suffixes and normalizes words
☆19 · Dec 16, 2021 · Updated 4 years ago