Targeted language identifier, based on FastText and Hunspell.
☆38Sep 4, 2025Updated 7 months ago
Alternatives and similar repositories for fastspell
Users that are interested in fastspell are comparing it to the libraries listed below.
- Tool to fix bitexts and tag near-duplicates for removal☆35Sep 4, 2025Updated 7 months ago
- Library for fast text representation and classification.☆31Jan 9, 2024Updated 2 years ago
- OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models.☆58Feb 3, 2026Updated 2 months ago
- Repository accompanying "An Open Dataset and Model for Language Identification" (Burchell et al., 2023)☆75Apr 1, 2025Updated last year
- Corset is a web-based data selection portal that helps you get relevant data from massive amounts of parallel data.☆21Nov 6, 2023Updated 2 years ago
- Rust wrapper for the cld2 language detection library.☆16Nov 28, 2017Updated 8 years ago
- [LREC 2024] 🖋 Resource and Tool for Writing System Identification☆21Mar 29, 2026Updated last month
- Transform TMX to text☆27Nov 23, 2022Updated 3 years ago
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus.☆160Jun 18, 2024Updated last year
- Lyrics crawling, pre-processing, embedding generation, model training, and lyrics generation - all in one tool☆14Nov 4, 2018Updated 7 years ago
- fasttext with wheels and no external dependency, but only the predict method (<1MB)☆19Nov 23, 2024Updated last year
- Finite state compiler, processor and helper tools used by apertium☆20Jan 29, 2026Updated 3 months ago
- Creating super-parallel corpora of 1500+ unique languages for NLP research☆34Dec 8, 2022Updated 3 years ago
- AfroLID, a powerful neural toolkit for African languages identification which covers 517 African languages.☆38Feb 5, 2026Updated 2 months ago
- Extracts plain text, language identification and more metadata from WARC records☆23Apr 16, 2026Updated 2 weeks ago
- Material for a course on Advanced NLP☆16Jul 22, 2025Updated 9 months ago
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/…☆29Apr 17, 2024Updated 2 years ago
- Tensorflow implementation of the Skipgram model with different scripts to train Portuguese word embeddings.☆18Aug 26, 2017Updated 8 years ago
- The source code for the TIRA Shared Task Platform☆17Updated this week
- Inference slice of marian for bergamot's tiny11 models. Faster to compile, and wield. Fewer model-archs than bergamot-translator.☆14Oct 24, 2024Updated last year
- A Sphinx theme for the CrateDB documentation.☆22Apr 3, 2026Updated 3 weeks ago
- Supports BananaPi BPI -M2 (Kernel3.3)☆11Nov 3, 2016Updated 9 years ago
- Finite-state script normalization and processing utilities☆47Apr 16, 2026Updated 2 weeks ago
- Code repo for "Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers" (ACL 2023)☆22Nov 1, 2023Updated 2 years ago
- A small wrapper around python logging module which can easily format and write logs to file.☆12Jan 9, 2023Updated 3 years ago
- ☆23Jan 25, 2023Updated 3 years ago
- An Elixir wrapper around the Rust Lingua language detection library.☆16Apr 6, 2026Updated 3 weeks ago
- ☆82Jan 30, 2026Updated 3 months ago
- Data type isomorphic to α ∨ β ∨ (α ∧ β)☆14Apr 27, 2022Updated 4 years ago
- Code for constructing TLDR corpus from Reddit dataset☆27Nov 23, 2021Updated 4 years ago
- Micro-framework for publishing linked data☆11Aug 1, 2017Updated 8 years ago
- Opus codec support for Python.☆32Oct 7, 2022Updated 3 years ago
- Bicleaner fork that uses neural networks☆40Feb 23, 2026Updated 2 months ago
- Lossless normalization of uppercase characters☆11Jul 3, 2023Updated 2 years ago
- A huge number library for Purescript with emphasis on correctness.☆12Apr 27, 2022Updated 4 years ago
- Relational Scheme interpreter, written in miniKanren, with Scheme pattern matcher☆11Mar 17, 2015Updated 11 years ago
- An Easy Annotation Tool for Natural Language Processing☆11May 17, 2024Updated last year
- Game Boy Clock Accuracy Challenge☆13Mar 30, 2023Updated 3 years ago
- Generating English Rock lyrics using BERT☆19May 10, 2019Updated 6 years ago