codegram / calbert
Catalan ALBERT (A Lite BERT for self-supervised learning of language representations)
☆13 Updated 4 years ago
Related projects:
- A Raspberry Pi 64-bit image with spaCy and neuralcoref pre-installed ☆21 Updated 4 years ago
- Extremely easy to use sequence-to-sequence library with attention, for text-to-text conversion tasks. ☆39 Updated 3 years ago
- A web interface to understand language-specific BERT models ☆17 Updated 5 months ago
- spaCy match and replace, maintaining conjugation ☆34 Updated last year
- Experiments with Hugging Face 🔬 🤗 ☆45 Updated last month
- Experiments with generating GPT-2 fanfiction on specified topics. ☆11 Updated 5 years ago
- ☆29 Updated 2 years ago
- Source code for the Apple reproduction ☆30 Updated 3 years ago
- A simple neural truecaser written in PyTorch and AllenNLP. ☆31 Updated 3 months ago
- Simple and clean Python implementation of TextRank as per the seminal paper by Rada Mihalcea and Paul Tarau. This implementation performs bot… ☆11 Updated 3 years ago
- ☆10 Updated 3 years ago
- An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline. ☆85 Updated 3 years ago
- ☆13 Updated 4 years ago
- Crawling engine that crawls a set of top-level domains looking for documents in a list of languages ☆11 Updated 7 months ago
- BERT models for many languages created from Wikipedia texts ☆34 Updated 4 years ago
- A web application for tagging and retrieval of arguments in text ☆29 Updated last year
- The Seshat audio annotation management platform ☆13 Updated 3 years ago
- This project lets you automatically extract glossary words and their definitions from a given piece of text using NLP techniques ☆29 Updated 4 years ago
- ☆16 Updated last month
- Transformer-based Trigram Blocking implementation in TensorFlow ☆11 Updated 4 years ago
- Reddit title generator API based on GPT-2 ☆20 Updated 4 years ago
- ALMa (Active Learning Manager) keeps track of labeled and unlabeled data for active learning ☆42 Updated 4 years ago
- fastlangid, the only language identification package that supports Cantonese (zh-yue), simplified (zh-hans) and traditional Chinese (zh-ha… ☆38 Updated last year
- Code for my blog post on Generating Words from Embeddings ☆23 Updated last month
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆42 Updated 5 years ago
- An example of how to use spaCy for extremely large files without running into memory issues ☆36 Updated 2 years ago
- Code showing how to load fastText vectors into spaCy ☆18 Updated 3 years ago
- ☆9 Updated 3 years ago
- MoodCat😼 classifies the mood of English sentences. ☆13 Updated 2 years ago
- Dataset Release for Intent Classification from Speech ☆43 Updated last year