microsoft / Computational-Use-of-Data-Agreement
Computational Use of Data Agreement - Removing Barriers to Data Innovation
☆20 · Updated 2 years ago
Alternatives and similar repositories for Computational-Use-of-Data-Agreement
Users who are interested in Computational-Use-of-Data-Agreement are comparing it to the repositories listed below.
- Open Use of Data Agreement - Removing Barriers to Data Innovation ☆17 · Updated 3 years ago
- Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic pr… ☆69 · Updated 3 weeks ago
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆65 · Updated last year
- The Mueller Report Corpus V 0.1 ☆11 · Updated 5 years ago
- Recipes for training OpenNMT systems ☆14 · Updated 7 years ago
- Discontinuous Data-Oriented Parsing ☆46 · Updated last year
- OpenNeuroSpell contains parts of NeuroSpell (http://neurospell.com/en.php) released as open-source. More code will be published as soon a… ☆20 · Updated 8 months ago
- FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.g… ☆113 · Updated 5 months ago
- ✨ Models for the NeuralCoref coreference resolution module ☆7 · Updated 6 years ago
- A powerful, tagset-independent and theory-neutral meta model and API for storing, manipulating, and representing nearly all types of ling… ☆15 · Updated 2 years ago
- A machine learning software for extracting information from scholarly documents ☆23 · Updated 4 years ago
- FoLiA library for C++ ☆16 · Updated last month
- Framework for creating and accessing UBY resources – sense-linked lexical resources in standard UBY-LMF format ☆22 · Updated 7 years ago
- Democratizing NLP! ☆105 · Updated last year
- Various utilities for processing the data. ☆210 · Updated this week
- Highly specialized crate to parse and use `google/sentencepiece`'s precompiled_charsmap in `tokenizers` ☆19 · Updated 3 years ago
- A web-based, token-level annotation tool for non-standard language data ☆10 · Updated 4 years ago
- Identifying Historical People, Places and other Entities: Shared Task on Named Entity Recognition and Linking on Historical Newspapers at… ☆22 · Updated 11 months ago
- Search back-end for dependency tree search. See the docs at https://fginter.github.io/dep_search/ ☆17 · Updated 7 years ago
- A project about benchmarking and evaluating existing PDF extraction tools on their semantic abilities to extract the body texts from PDF … ☆68 · Updated 4 years ago
- Tools for TICCL ☆14 · Updated last month
- DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate… ☆52 · Updated 5 years ago
- A set of workflows for corpus building through OCR, post-correction and normalisation ☆49 · Updated 2 years ago
- ✨ Web interface for NeuralCoref coreference resolution ☆35 · Updated 2 years ago
- Collaborative NLP annotation tool supporting enterprise authentication, inter-annotator statistics, active learning ☆13 · Updated 2 years ago
- hyp: hypergraphs toolkit ☆31 · Updated 9 years ago
- A neural network that jointly part-of-speech tags and lemmatizes sentences, boosting accuracy for morphologically-rich languages (Czech, … ☆34 · Updated 6 years ago
- An efficient implementation of Partitioned Label Trees & its variations for extreme multi-label classification ☆88 · Updated last year
- CRF-based Morphological Tagging and Lemmatization ☆37 · Updated 5 years ago
- Match tokenized words and phrases within the original, untokenized, often messy, text. ☆19 · Updated 2 years ago