microsoft / Computational-Use-of-Data-Agreement
Computational Use of Data Agreement - Removing Barriers to Data Innovation
☆20 Updated last year
Alternatives and similar repositories for Computational-Use-of-Data-Agreement:
Users who are interested in Computational-Use-of-Data-Agreement are comparing it to the repositories listed below:
- arXiv submission core ☆14 Updated 3 years ago
- Exploring implementing a simple tagger using neural network frameworks ☆20 Updated 2 years ago
- ✨ Models for the NeuralCoref coreference resolution module ☆7 Updated 6 years ago
- ☆12 Updated 3 years ago
- Tools for generating synthetic document corpora ☆13 Updated last year
- Towards Neural Phrase-based Machine Translation ☆12 Updated last year
- ☆48 Updated 6 years ago
- ✨ Web interface for NeuralCoref coreference resolution ☆34 Updated last year
- An English lexical database from the Big 🍎, let's go Mets baby love da Mets ☆14 Updated 2 months ago
- Dataset and code for three Web crawling-related papers from SIGIR-2019, NeurIPS-2019, and ICML-2020. ☆39 Updated last week
- CyBERTron-LM is a project which collects some pre-trained Transformer-based models. ☆12 Updated last year
- C++ mosestokenizer ☆16 Updated 10 months ago
- Doing things with embeddings ☆64 Updated 2 years ago
- Efficient teacher-student models and scripts to make them ☆49 Updated last year
- A Streamlit app to add structured tags to a dataset card ☆22 Updated 2 years ago
- ☆17 Updated 4 years ago
- The Mueller Report Corpus V 0.1 ☆11 Updated 4 years ago
- Jupyter extension to visualize dependency structures ☆28 Updated 6 years ago
- ☆17 Updated 6 years ago
- universal tokenizer ☆15 Updated 3 years ago
- Indra is a Web Service which allows easy access to different distributional semantics models in several languages. ☆47 Updated 3 years ago
- SALM: Suffix Array and its Applications in Empirical Language Processing by Joy ☆11 Updated 7 years ago
- FoLiA library for C++ ☆16 Updated this week
- LaMachine - A software distribution of our in-house as well as some 3rd party NLP software - Virtual Machine, Docker, or local compilatio… ☆68 Updated last year
- ☆11 Updated 3 years ago
- Highly specialized crate to parse and use `google/sentencepiece`'s `precompiled_charsmap` in `tokenizers` ☆18 Updated 2 years ago
- Extract Data from Wikipedia Tables ☆33 Updated 7 years ago
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆18 Updated last year
- Python package to compute metrics on an NLU intent parsing pipeline ☆13 Updated 4 years ago
- Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic pr… ☆66 Updated last month