Library for fast text representation and classification.
☆31 · Jan 9, 2024 · Updated 2 years ago
Alternatives and similar repositories for fasterText
Users who are interested in fasterText are comparing it to the libraries listed below.
- ☆13 · Aug 23, 2024 · Updated last year
- Bicleaner fork that uses neural networks ☆40 · Jan 29, 2026 · Updated last month
- ☆38 · Apr 17, 2024 · Updated last year
- Extracts plain text, language identification and more metadata from WARC records ☆23 · Oct 1, 2025 · Updated 4 months ago
- A Python utility for indexing file lines. Best demo honourable mention at ECIR 2024. ☆23 · Nov 9, 2025 · Updated 3 months ago
- An OpenAI API compatible LLM inference server based on ExLlamaV2. ☆25 · Feb 9, 2024 · Updated 2 years ago
- PyTorch implementation of NAACL 2021 paper "Multi-view Subword Regularization" ☆26 · Jun 2, 2021 · Updated 4 years ago
- Transform TMX to text ☆28 · Nov 23, 2022 · Updated 3 years ago
- ☆12 · Jan 17, 2026 · Updated last month
- ☆133 · Jan 22, 2026 · Updated last month
- Repository for our ICLR 2019 paper: Discovery of Natural Language Concepts in Individual Units of CNNs ☆26 · Mar 9, 2019 · Updated 6 years ago
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. ☆160 · Jun 18, 2024 · Updated last year
- Cartesia Line SDK for voice agents. ☆92 · Updated this week
- ☆32 · Mar 30, 2023 · Updated 2 years ago
- ☆36 · Nov 15, 2023 · Updated 2 years ago
- AfroLID, a powerful neural toolkit for African languages identification which covers 517 African languages. ☆35 · Feb 5, 2026 · Updated 3 weeks ago
- ☆34 · Nov 22, 2021 · Updated 4 years ago
- Arabic News Stance Corpus ☆11 · Feb 5, 2021 · Updated 5 years ago
- A parallel evaluation data set of SAP software documentation with document structure annotation ☆14 · Jul 30, 2025 · Updated 7 months ago
- Extract information from XBRL files in the ESEF format ☆13 · Jan 3, 2026 · Updated last month
- ☆39 · Oct 3, 2022 · Updated 3 years ago
- Efficient Low-Memory Aligner ☆146 · Jan 15, 2025 · Updated last year
- Tools for managing datasets for governance and training. ☆90 · Jan 19, 2026 · Updated last month
- A High-Quality Multilingual Dataset for Structured Documentation Translation ☆37 · May 1, 2025 · Updated 9 months ago
- A probabilistic approximate DNF counter ☆39 · Nov 30, 2025 · Updated 3 months ago
- Lightweight, multilingual natural language processing ☆63 · Apr 8, 2013 · Updated 12 years ago
- IAI Style Guide ☆10 · Jun 27, 2025 · Updated 8 months ago
- ApertureDB Python Client ☆12 · Jan 14, 2026 · Updated last month
- Romanian Word Embeddings. Here you can find pre-trained corpora of word embeddings. Current methods: CBOW, Skip-Gram, Fast-Text (from Gen… ☆13 · Oct 6, 2025 · Updated 4 months ago
- Rust port of annoy (https://github.com/spotify/annoy) ☆45 · Aug 19, 2025 · Updated 6 months ago
- Text preprocessing package for use in NLP tasks https://pypi.org/project/textcl/ ☆11 · Aug 9, 2024 · Updated last year
- Implementation of SiameseXML (ICML 2021) ☆40 · Oct 26, 2022 · Updated 3 years ago
- Walks through building different HTML5 layouts for AV systems ☆12 · Oct 15, 2021 · Updated 4 years ago
- Extract annotated misspellings from MIMIC-III. ☆13 · Dec 17, 2020 · Updated 5 years ago
- This is a telegram bot for correcting language mistakes in group chats ☆10 · Jun 29, 2021 · Updated 4 years ago
- COMET for African languages ☆10 · Jan 24, 2025 · Updated last year
- PyTorch Implementation of Context-Aware Sequential Model for Multi-Behaviour Recommendation https://arxiv.org/abs/2312.09684 ☆10 · May 31, 2024 · Updated last year
- Statistics from our binary transformation framework ☆11 · Jan 16, 2025 · Updated last year
- ☆40 · May 2, 2021 · Updated 4 years ago