Multilingual sentence alignment using sentence embeddings
☆146Nov 4, 2024Updated last year
Alternatives and similar repositories for bertalign
Users that are interested in bertalign are comparing it to the libraries listed below.
- Improved Sentence Alignment in Linear Time and Space☆193Mar 6, 2023Updated 3 years ago
- ☆37Mar 16, 2026Updated 3 weeks ago
- A 2024 Reading List for Bilingual Lexicon Induction (BLI) / Word Translation. Frequently Updated.☆23Sep 29, 2024Updated last year
- Machine-Translation-based sentence alignment tool for parallel text☆315Mar 18, 2021Updated 5 years ago
- A neural word aligner based on multilingual BERT☆374Mar 10, 2022Updated 4 years ago
- Code for our paper in ACL 2017☆13Dec 14, 2017Updated 8 years ago
- An accurate multilingual word aligner based on LaBSE☆24Oct 25, 2023Updated 2 years ago
- A fully-fledged PyTorch package for Morphological Analysis, tailored to morphologically rich and historical languages.☆24Oct 27, 2023Updated 2 years ago
- Sentence aligner☆126May 21, 2021Updated 4 years ago
- Obtain Word Alignments using Pretrained Language Models (e.g., mBERT)☆393Nov 7, 2023Updated 2 years ago
- Bicleaner fork that uses neural networks☆40Feb 23, 2026Updated last month
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus.☆160Jun 18, 2024Updated last year
- Official implementations for (1) BlonDe: An Automatic Evaluation Metric for Document-level Machine Translation and (2) Discourse Centric …☆83Sep 21, 2023Updated 2 years ago
- State-of-the-art LLM-based translation models.☆582Apr 9, 2025Updated last year
- Translate5: Open Source Translation System (first published on GitHub on 2020-08-10)☆49Apr 8, 2026Updated last week
- Adaptive Machine Translation with Large Language Models☆31Jan 4, 2025Updated last year
- Linguistically analyzed Classical Tibetan texts☆28Jun 30, 2021Updated 4 years ago
- Neural CRF Model for Sentence Alignment in Text Simplification☆67Jan 19, 2025Updated last year
- Transcription corpora for training HTR models for medieval manuscripts from the 12th to the 15th century.☆25Jan 17, 2025Updated last year
- Bias correction for richness in abundance data☆12Mar 28, 2026Updated 2 weeks ago
- official code for "EgoVSR: Towards High-Quality Egocentric Video Super-Resolution"☆15Jul 26, 2023Updated 2 years ago
- Neural Network approaches for the Traveling Salesman Problem☆10Apr 20, 2021Updated 4 years ago
- Bilingual term extractor☆59Nov 19, 2025Updated 4 months ago
- An Interactive Tool for Annotating Discourse Structure and Text Improvement☆16Sep 15, 2021Updated 4 years ago
- TER-plus Machine Translation metric.☆31May 23, 2022Updated 3 years ago
- Text Re-use Alignment Visualization☆38Nov 8, 2017Updated 8 years ago
- This repository will soon contain all scripts and links to the annotated corpora of Tibetan.☆14Feb 4, 2025Updated last year
- A Fast Image Converter that supports common image formats. It uses WebAssembly for all conversions so no image is sent to the server…☆11Jul 10, 2025Updated 9 months ago
- Sanskrit Scala and Java code: miscellaneous utilities, webapp☆15Jun 7, 2021Updated 4 years ago
- React application using Segment Anything in browser☆10Oct 9, 2023Updated 2 years ago
- Sanskrit Tibetan Parallel Dataset☆11Jul 2, 2025Updated 9 months ago
- Code and data for the IWSLT 2022 shared task on Formality Control for SLT☆22May 24, 2023Updated 2 years ago
- Improving Word Translation via Two-Stage Contrastive Learning (ACL 2022). Keywords: Bilingual Lexicon Induction, Word Translation, Cross-…☆36Jan 23, 2025Updated last year
- This project implements the interaction between two dotnet core services through mTLS.☆15Jan 28, 2024Updated 2 years ago
- Code for ACL 2023 main conference paper "Back Translation for Speech-to-text Translation Without Transcripts".☆12Oct 25, 2023Updated 2 years ago
- data, metadata, tools, and LDA experiments on a corpus of Sanskrit philosophy texts☆12Nov 28, 2021Updated 4 years ago
- Official repository for U-SAM (Interspeech 2025)☆26Jun 3, 2025Updated 10 months ago
- A software to detect text reuse with BLAST.☆13Oct 8, 2019Updated 6 years ago
- Witwicky: An implementation of Transformer in PyTorch.☆22Aug 17, 2020Updated 5 years ago