Bicleaner fork that uses neural networks
☆40 · Feb 23, 2026 · Updated 3 months ago
Alternatives and similar repositories for bicleaner-ai
Users that are interested in bicleaner-ai are comparing it to the libraries listed below.
- Efficient teacher-student models and scripts to make them ☆57 · Dec 16, 2023 · Updated 2 years ago
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. ☆160 · Jun 18, 2024 · Updated last year
- ☆13 · Aug 23, 2024 · Updated last year
- Curriculum training ☆22 · Jun 25, 2025 · Updated 11 months ago
- ☆38 · Mar 16, 2026 · Updated 2 months ago
- OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models. ☆58 · Feb 3, 2026 · Updated 3 months ago
- The code, training pipeline, and models that power Firefox Translations ☆285 · May 18, 2026 · Updated last week
- Unsupervised factor-based text tokenizer for natural-language processing applications ☆17 · Jul 24, 2020 · Updated 5 years ago
- Library for fast text representation and classification. ☆31 · Jan 9, 2024 · Updated 2 years ago
- [EMNLP 2020] Obtain Word Alignments using Pretrained Language Models (e.g., mBERT) ☆395 · Nov 7, 2023 · Updated 2 years ago
- OpusFilter - Parallel corpus processing toolkit ☆115 · May 13, 2026 · Updated last week
- ☆15 · Jun 17, 2019 · Updated 6 years ago
- Open language modeling toolkit based on PyTorch ☆185 · Apr 23, 2026 · Updated last month
- A tool that locates, downloads, and extracts machine translation corpora ☆163 · Apr 13, 2026 · Updated last month
- ☆13 · Jul 31, 2023 · Updated 2 years ago
- ☆28 · Jul 30, 2024 · Updated last year
- The official code for our EMNLP 2022 long paper [Breaking the Representation Bottleneck of Chinese Characters: Neural Machine Translation… ☆26 · Sep 10, 2025 · Updated 8 months ago
- ☆21 · Feb 13, 2023 · Updated 3 years ago
- Improved Sentence Alignment in Linear Time and Space ☆196 · Mar 6, 2023 · Updated 3 years ago
- ☆24 · Apr 2, 2024 · Updated 2 years ago
- ☆34 · Nov 22, 2021 · Updated 4 years ago
- [ACL 2025] 🔍 Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment ☆11 · Apr 6, 2025 · Updated last year
- ☆14 · May 26, 2023 · Updated 3 years ago
- UzTransliterator | State-of-the-art machine transliteration tool for Uzbek language ☆13 · Jan 6, 2026 · Updated 4 months ago
- A Neural Framework for MT Evaluation ☆753 · Apr 21, 2026 · Updated last month
- Tools for formatting WMT hypothesis and test sets in XML ☆27 · Apr 18, 2025 · Updated last year
- COrpus based Morphological Analyzer with INtegrated User dictionary ☆21 · Mar 30, 2025 · Updated last year
- ☆13 · Jan 17, 2024 · Updated 2 years ago
- SpanAlign: Sentence Alignment Method based on Cross-Language Span Prediction and ILP ☆14 · Mar 24, 2021 · Updated 5 years ago
- Bilingual lexicons map words in one language to their translations in another, and are typically induced by learning linear project… ☆18 · Jun 1, 2021 · Updated 4 years ago
- Bitextor generates translation memories from multilingual websites ☆299 · Nov 11, 2024 · Updated last year
- Python binding for the SFML library, using pybind11 ☆14 · Sep 27, 2023 · Updated 2 years ago
- Inference slice of marian for bergamot's tiny11 models. Faster to compile, and wield. Fewer model-archs than bergamot-translator. ☆14 · Oct 24, 2024 · Updated last year
- CSE201 Object-Oriented Programming in C++: Teach an AI to produce pieces of music ☆12 · Jan 23, 2019 · Updated 7 years ago
- ☆20 · Oct 22, 2021 · Updated 4 years ago
- ☆14 · Jan 4, 2021 · Updated 5 years ago
- Loopback web application for administration of Datawake networks ☆10 · May 2, 2017 · Updated 9 years ago
- Accepted to ICLR 2025. MetaMetrics is a calibrated meta-metric designed to evaluate generation tasks across different modalities aligned … ☆15 · Dec 30, 2024 · Updated last year
- Aldebaran is a cross-platform (Discord and Revolt) multi-purpose bot that offers useful features to DiscordRPG players along with many … ☆11 · Jan 9, 2024 · Updated 2 years ago