An accurate multilingual word aligner based on LaBSE
☆24Oct 25, 2023Updated 2 years ago
Alternatives and similar repositories for AccAlign
Users who are interested in AccAlign are comparing it to the libraries listed below.
- This is the code for neural-Jacana aligner, and the data for MultiMWA dataset.☆20Feb 12, 2023Updated 3 years ago
- Code for paper “Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability”☆15Jun 13, 2023Updated 2 years ago
- Yet another Python binding for Juman++/KNP/KWJA☆39Updated this week
- Experiments for XLM-V Transformers Integration☆13Feb 8, 2023Updated 3 years ago
- NanGe - A Rule-based Chinese-English Machine Translation System☆20Jul 23, 2017Updated 8 years ago
- Code for our paper "Mask-Align: Self-Supervised Neural Word Alignment" in ACL 2021☆61May 10, 2021Updated 4 years ago
- ☆13Apr 13, 2021Updated 5 years ago
- ☆57Dec 27, 2025Updated 4 months ago
- String Distance using cython☆13Jan 19, 2020Updated 6 years ago
- A neural word aligner based on multilingual BERT☆375Mar 10, 2022Updated 4 years ago
- SpanAlign: Sentence Alignment Method based on Cross-Language Span Prediction and ILP☆14Mar 24, 2021Updated 5 years ago
- GLADIS: A General and Large Acronym Disambiguation Benchmark (EACL 23)☆18Jun 24, 2024Updated last year
- A different, but useful, textcat approach.☆18Jul 15, 2024Updated last year
- Repository of ACL2023 paper: Unbalanced Optimal Transport for Unbalanced Word Alignment☆38Sep 13, 2023Updated 2 years ago
- ☆10Oct 17, 2021Updated 4 years ago
- ☆37Nov 14, 2025Updated 5 months ago
- L&S 88-5 Connector Course to Data 8☆15Apr 12, 2018Updated 8 years ago
- [ACL 2025] 🔍 Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment☆11Apr 6, 2025Updated last year
- Library for experimenting with state-of-the-art evaluation metrics like UScore☆12May 27, 2023Updated 2 years ago
- A centrality-based keyphrase extraction tool for Chinese☆11Sep 2, 2022Updated 3 years ago
- NOAH's Corpus: Part-of-Speech Tagging for Swiss German☆12Jan 6, 2023Updated 3 years ago
- Interactive parametric benchmarks in Python☆17Apr 18, 2021Updated 5 years ago
- EWoK dataset generation framework☆12May 14, 2024Updated last year
- [WIP] A TSX-like language that's safer, more functional, and compiles to JSX.☆12Jun 19, 2025Updated 10 months ago
- Swete's LXX Text from 1KY Greek with Corrections Against Manuscripts☆10Oct 11, 2020Updated 5 years ago
- ☆15Nov 20, 2025Updated 5 months ago
- Official code implementation of "Tree-based Focused Web Crawling with Reinforcement Learning" and the TRES framework☆23Feb 16, 2026Updated 2 months ago
- KG data for ODA☆12Sep 21, 2024Updated last year
- Python package to augment multilingual data☆15Feb 15, 2023Updated 3 years ago
- Python port for IWNLP.Lemmatizer☆19Apr 13, 2026Updated 2 weeks ago
- A powerful text cleaner for Japanese web texts☆12Jan 20, 2024Updated 2 years ago
- ☆19Jun 9, 2025Updated 10 months ago
- Can LLMs generate code-mixed sentences through zero-shot prompting?☆11Apr 18, 2023Updated 3 years ago
- ☆13Oct 5, 2025Updated 6 months ago
- Logical inference system based on event semantics and degree semantics in formal semantics☆10Jan 22, 2023Updated 3 years ago
- An implementation of "Subspace Representations for Soft Set Operations and Sentence Similarities" (NAACL 2024)☆10May 31, 2024Updated last year
- TaCo: Enhancing Cross-Lingual Transfer for Low-Resource Languages in LLMs through Translation-Assisted Chain-of-Thought Processes☆14Jul 1, 2025Updated 10 months ago
- Java command line tool to convert PAGE XML files with layout and text content to PDF☆10Apr 27, 2020Updated 6 years ago
- Scripts to preprocess training and test data and to run fast_align and giza☆107Nov 2, 2021Updated 4 years ago