Library for fast text representation and classification.
☆31 · Jan 9, 2024 · Updated 2 years ago
Alternatives and similar repositories for fasterText
Users that are interested in fasterText are comparing it to the libraries listed below.
- ☆13 · Aug 23, 2024 · Updated last year
- Repository accompanying "An Open Dataset and Model for Language Identification" (Burchell et al., 2023) ☆75 · Apr 1, 2025 · Updated last year
- Statistics on multilingual datasets ☆17 · Jul 12, 2022 · Updated 3 years ago
- Bicleaner fork that uses neural networks ☆40 · Feb 23, 2026 · Updated 2 months ago
- Efficient teacher-student models and scripts to make them ☆57 · Dec 16, 2023 · Updated 2 years ago
- Transform TMX to text ☆27 · Nov 23, 2022 · Updated 3 years ago
- Extracts plain text, language identification and more metadata from WARC records ☆23 · Apr 16, 2026 · Updated 2 weeks ago
- ☆34 · Nov 22, 2021 · Updated 4 years ago
- An OpenAI API compatible LLM inference server based on ExLlamaV2. ☆25 · Feb 9, 2024 · Updated 2 years ago
- Meedan's Open Source Arabic/English Translation Memory ☆33 · Nov 4, 2009 · Updated 16 years ago
- ☆28 · Feb 11, 2026 · Updated 2 months ago
- Code for our project CROWN (Conversational Passage Ranking by Reasoning over Word Networks) ☆10 · Jan 11, 2024 · Updated 2 years ago
- A polite and user-friendly downloader for Common Crawl data ☆77 · Updated this week
- Fast Neural Machine Translation in C++ - development repository ☆23 · May 12, 2024 · Updated last year
- Fast search index for SPLADE sparse retrieval models implemented in Python using Numpy and Numba ☆38 · Oct 16, 2025 · Updated 6 months ago
- ☆141 · Apr 8, 2026 · Updated 3 weeks ago
- A parallel evaluation data set of SAP software documentation with document structure annotation ☆15 · Jul 30, 2025 · Updated 9 months ago
- Probe TemporaryExposureKeys and Files of Exposure Notifications System in Japan a.k.a. "COCOA". ☆10 · Sep 14, 2021 · Updated 4 years ago
- COMET for African languages ☆11 · Jan 24, 2025 · Updated last year
- A simple semi-supervised approach for creating huggingface data script loaders and upload to the hub. ☆11 · Jun 23, 2024 · Updated last year
- AfroLID, a powerful neural toolkit for African languages identification which covers 517 African languages. ☆38 · Feb 5, 2026 · Updated 2 months ago
- ☆37 · Mar 16, 2026 · Updated last month
- The pipeline for the OSCAR corpus ☆177 · Nov 9, 2025 · Updated 5 months ago
- Micro-framework for publishing linked data ☆11 · Aug 1, 2017 · Updated 8 years ago
- Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation ☆15 · Aug 27, 2024 · Updated last year
- IAI Style Guide ☆11 · Jun 27, 2025 · Updated 10 months ago
- Shinobigami session support bot ☆12 · Dec 19, 2025 · Updated 4 months ago
- Medusa combo files, Hashcat rules and dictionaries, JRT rules ☆14 · Oct 20, 2022 · Updated 3 years ago
- An Easy Annotation Tool for Natural Language Processing ☆11 · May 17, 2024 · Updated last year
- PyTorch implementation of NAACL 2021 paper "Multi-view Subword Regularization" ☆26 · Jun 2, 2021 · Updated 4 years ago
- ☆16 · Oct 17, 2024 · Updated last year
- Code and experiments for the COLING2020 paper "Conception: Multilingually-Enhanced, Human-Readable Concept Vector Representations". ☆11 · Dec 9, 2020 · Updated 5 years ago
- Summaries and notes on CounterFactual Machine Learning papers ☆19 · Dec 13, 2018 · Updated 7 years ago
- SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset. ☆47 · Jul 25, 2023 · Updated 2 years ago
- ☆14 · Apr 18, 2020 · Updated 6 years ago
- BabelNet (and WordNet) sense embedding trained with Word2Vec and FastText ☆10 · Sep 3, 2019 · Updated 6 years ago
- DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate… ☆52 · Jun 12, 2020 · Updated 5 years ago
- Datalog engine based on DuckDB ☆10 · Mar 8, 2023 · Updated 3 years ago
- A Workbench for Autograding Retrieve/Generate Systems ☆15 · Jun 30, 2025 · Updated 9 months ago