Library for fast text representation and classification.
☆31 · Jan 9, 2024 · Updated 2 years ago
Alternatives and similar repositories for fasterText
Users that are interested in fasterText are comparing it to the libraries listed below
- ☆13 · Aug 23, 2024 · Updated last year
- Statistics on multilingual datasets ☆17 · Jul 12, 2022 · Updated 3 years ago
- Bicleaner fork that uses neural networks ☆40 · Feb 23, 2026 · Updated 3 weeks ago
- A Python utility for indexing file lines. Best demo honourable mention at ECIR 2024. ☆23 · Nov 9, 2025 · Updated 4 months ago
- An open-source handbook of applied guidance and tools for sustainable software development and maintenance. ☆23 · Mar 13, 2026 · Updated last week
- ☆38 · Apr 17, 2024 · Updated last year
- A collection of Zsh functions to augment Git ☆19 · Dec 11, 2025 · Updated 3 months ago
- Efficient teacher-student models and scripts to make them ☆55 · Dec 16, 2023 · Updated 2 years ago
- Extracts plain text, language identification and more metadata from WARC records ☆23 · Oct 1, 2025 · Updated 5 months ago
- ☆34 · Nov 22, 2021 · Updated 4 years ago
- An OpenAI API compatible LLM inference server based on ExLlamaV2. ☆25 · Feb 9, 2024 · Updated 2 years ago
- ☆28 · Feb 11, 2026 · Updated last month
- Code for our project CROWN (Conversational Passage Ranking by Reasoning over Word Networks) ☆10 · Jan 11, 2024 · Updated 2 years ago
- collaborative web tool to enrich content ☆12 · Nov 13, 2011 · Updated 14 years ago
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. ☆160 · Jun 18, 2024 · Updated last year
- A corpus of diacritized Hebrew texts (טקסט מנוקד) ☆11 · May 4, 2022 · Updated 3 years ago
- A polite and user-friendly downloader for Common Crawl data ☆71 · Mar 3, 2026 · Updated 2 weeks ago
- Fast Neural Machine Translation in C++ - development repository ☆23 · May 12, 2024 · Updated last year
- ☆134 · Jan 22, 2026 · Updated last month
- Fast search index for SPLADE sparse retrieval models implemented in Python using Numpy and Numba ☆37 · Oct 16, 2025 · Updated 5 months ago
- A parallel evaluation data set of SAP software documentation with document structure annotation ☆14 · Jul 30, 2025 · Updated 7 months ago
- COMET for African languages ☆11 · Jan 24, 2025 · Updated last year
- A simple semi-supervised approach for creating huggingface data script loaders and upload to the hub. ☆11 · Jun 23, 2024 · Updated last year
- All code and content for my blog. ☆15 · Sep 23, 2018 · Updated 7 years ago
- AfroLID, a powerful neural toolkit for African languages identification which covers 517 African languages. ☆36 · Feb 5, 2026 · Updated last month
- ☆32 · Mar 30, 2023 · Updated 2 years ago
- ☆37 · Nov 15, 2023 · Updated 2 years ago
- The pipeline for the OSCAR corpus ☆176 · Nov 9, 2025 · Updated 4 months ago
- Micro-framework for publishing linked data ☆11 · Aug 1, 2017 · Updated 8 years ago
- IAI Style Guide ☆11 · Jun 27, 2025 · Updated 8 months ago
- Library and command line utility to do approximate string matching of a source against a bitext index and get matched source and target. ☆51 · Apr 22, 2025 · Updated 10 months ago
- An Easy Annotation Tool for Natural Language Processing ☆11 · May 17, 2024 · Updated last year
- Efficient Low-Memory Aligner ☆146 · Jan 15, 2025 · Updated last year
- Online form that generates a travel exemption certificate (attestation de déplacement dérogatoire) ☆10 · Mar 18, 2020 · Updated 6 years ago
- Code and experiments for the COLING2020 paper "Conception: Multilingually-Enhanced, Human-Readable Concept Vector Representations". ☆11 · Dec 9, 2020 · Updated 5 years ago
- Dataset containing Semantic Relations and Metadata, for Training and Evaluating Distributional Semantic Models in English and Mandarin Ch… ☆16 · Aug 7, 2017 · Updated 8 years ago
- SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset. ☆47 · Jul 25, 2023 · Updated 2 years ago
- ☆14 · Apr 18, 2020 · Updated 5 years ago
- BabelNet (and WordNet) sense embedding trained with Word2Vec and FastText ☆10 · Sep 3, 2019 · Updated 6 years ago