codegram / calbert
Catalan ALBERT (A Lite BERT for self-supervised learning of language representations)
☆14Updated 5 years ago
Alternatives and similar repositories for calbert
Users who are interested in calbert are comparing it to the repositories listed below.
- Pre-production releases for Spacy in Catalan☆14Updated 3 years ago
- The RadioTalk dataset of talk radio transcripts☆60Updated 4 years ago
- An asynchronous concurrent pipeline for classifying Common Crawl based on fastText's pipeline.☆86Updated 4 years ago
- Official source for Catalan Language Models and resources made within Aina project.☆25Updated 2 years ago
- Experiments with Hugging Face 🔬 🤗☆44Updated last year
- Reference implementation of the paper "Word Embeddings for Entity-annotated Texts"☆18Updated 6 years ago
- Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.☆130Updated 4 years ago
- Text and Punctuation correction with Deep Learning☆128Updated 5 years ago
- Forced Alignments for Common Voice☆31Updated 5 years ago
- Reddit title generator API based on GPT-2☆19Updated 5 years ago
- Gamma Agreement in Python☆45Updated last year
- Code for my blog post on Generating Words from Embeddings☆23Updated last year
- Markdown template for Datasheets for Datasets☆63Updated 3 years ago
- Simple text to phonemes converter for multiple languages☆20Updated 2 years ago
- Neural Elastic Inference and Search☆19Updated 6 years ago
- Package for controllable summarization☆78Updated 2 years ago
- Experiments with generating GPT-2 fanfiction on specified topics.☆11Updated 6 years ago
- 🦁 Nala is an agile open-source voice assistant framework (20+ actions).☆35Updated 2 years ago
- Scansion tool for Spanish texts☆12Updated last year
- A web interface to understand language-specific BERT-models☆18Updated last year
- The repository for the paper "When Do You Need Billions of Words of Pretraining Data?"☆21Updated 5 years ago
- Parallelized automatic corpus collection for ASR. Forked from https://github.com/EgorLakomkin/KTSpeechCrawler☆24Updated 4 years ago
- BERT models for many languages created from Wikipedia texts☆33Updated 5 years ago
- A simple neural truecaser written in pytorch and allennlp.☆33Updated last year
- Multi-lingual Text Processing☆96Updated 6 years ago
- 🔉 A web app to play, visualize, and annotate your audio files for machine learning☆119Updated 5 years ago
- name2nat: a Python package for nationality prediction from a name☆115Updated 5 years ago
- Crawling engine that crawls a set of top-level domains looking for documents in a list of languages☆11Updated last year