Multilingual sentence alignment using sentence embeddings
☆154May 4, 2026Updated last month
Alternatives and similar repositories for bertalign
Users that are interested in bertalign are comparing it to the libraries listed below.
- Improved Sentence Alignment in Linear Time and Space☆197Mar 6, 2023Updated 3 years ago
- ☆38Mar 16, 2026Updated 2 months ago
- A 2024 Reading List for Bilingual Lexicon Induction (BLI) / Word Translation. Frequently Updated.☆23Sep 29, 2024Updated last year
- Machine-Translation-based sentence alignment tool for parallel text☆314Mar 18, 2021Updated 5 years ago
- A neural word aligner based on multilingual BERT☆376Mar 10, 2022Updated 4 years ago
- An accurate multilingual word aligner based on LaBSE☆24Oct 25, 2023Updated 2 years ago
- An advanced, extensible web front-end for the Manatee-open corpus search engine☆80Updated this week
- Open-source, AI-enhanced CAT tool with multi-LLM support, translation memory, glossary management, 'Superlookup' concordance across TMs/g…☆41Updated this week
- A fully-fledged PyTorch package for Morphological Analysis, tailored to morphologically rich and historical languages.☆25Oct 27, 2023Updated 2 years ago
- Sentence aligner☆129May 21, 2021Updated 5 years ago
- Bicleaner fork that uses neural networks☆40Feb 23, 2026Updated 3 months ago
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus.☆160Jun 18, 2024Updated last year
- Official implementations for (1) BlonDe: An Automatic Evaluation Metric for Document-level Machine Translation and (2) Discourse Centric …☆84Sep 21, 2023Updated 2 years ago
- ☆11Oct 14, 2023Updated 2 years ago
- Translate5: Open Source Translation System (published 1st time on github at 2020-08-10)☆51May 28, 2026Updated 2 weeks ago
- ☆28Jul 30, 2024Updated last year
- An implementation of data augmentation methods for natural language processing tasks.☆13Jul 25, 2024Updated last year
- Improving cross-lingual word embeddings by meeting in the middle☆23Aug 25, 2020Updated 5 years ago
- Automatic Idiomatic Expression Detection☆13Sep 26, 2021Updated 4 years ago
- Linguistically analyzed Classical Tibetan texts☆29Jun 30, 2021Updated 4 years ago
- Neural CRF Model for Sentence Alignment in Text Simplification☆67Jan 19, 2025Updated last year
- An open source Translation Memory Engine written in Java☆16Dec 22, 2022Updated 3 years ago
- Bias correction for richness in abundance data☆13Apr 20, 2026Updated last month
- Implementation of our paper "Exploiting Unsupervised Data for Emotion Recognition in Conversations" in the Findings of EMNLP-2020.☆13Nov 17, 2020Updated 5 years ago
- Reasoning-based Evaluation and Ranking of Translations.☆20Jun 2, 2026Updated last week
- official code for "EgoVSR: Towards High-Quality Egocentric Video Super-Resolution"☆15Jul 26, 2023Updated 2 years ago
- We release a dataset based on Wikipedia sentences and the corresponding translations in 6 different languages along with the scores (scal…☆81Aug 31, 2021Updated 4 years ago
- Automatic text comparison with an extendable variance classifier☆13Sep 11, 2023Updated 2 years ago
- Neural Network approaches for the Traveling Salesman Problem☆10Apr 20, 2021Updated 5 years ago
- Bilingual term extractor☆60Nov 19, 2025Updated 6 months ago
- Digital texts in Prakrit☆10Sep 14, 2025Updated 9 months ago
- Re-rank n-best lists using additional features.☆29Jun 5, 2018Updated 8 years ago
- An Interactive Tool for Annotating Discourse Structure and Text Improvement☆16Sep 15, 2021Updated 4 years ago
- Sanskrit Scala and Java code: miscellaneous utilities, webapp☆15Jun 7, 2021Updated 5 years ago
- pyTorch-text-classification☆16Dec 6, 2017Updated 8 years ago
- German Text Embedding Clustering Benchmark☆18Mar 15, 2024Updated 2 years ago
- Code for the INTERSPEECH 2023 paper "Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation…☆32Jan 14, 2025Updated last year
- Improving Word Translation via Two-Stage Contrastive Learning (ACL 2022). Keywords: Bilingual Lexicon Induction, Word Translation, Cross-…☆36Jan 23, 2025Updated last year
- Code for ACL 2023 main conference paper "Back Translation for Speech-to-text Translation Without Transcripts".☆11Oct 25, 2023Updated 2 years ago