uhermjakob / wildebeest
Scripts that investigate, repair, and normalize a wide range of text file problems at the character level.
☆21 · Updated 3 years ago
Alternatives and similar repositories for wildebeest
Users who are interested in wildebeest are comparing it to the libraries listed below.
- universal tokenizer · ☆16 · Updated 4 years ago
- Efficient teacher-student models and scripts to make them · ☆53 · Updated 2 years ago
- A set of pipelines for performing experiments on various NLP tasks with a focus on resource-poor/minority languages. · ☆37 · Updated last week
- Curated list of open source and openly accessible large language models · ☆25 · Updated 2 years ago
- Translation demonstrator · ☆35 · Updated 5 years ago
- Bilingual sentence similarity classifier using Tensorflow · ☆24 · Updated 6 years ago
- OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models. · ☆54 · Updated 2 months ago
- Open information and community for machine translation · ☆80 · Updated last month
- OpusFilter - Parallel corpus processing toolkit · ☆113 · Updated last week
- Multilingual sentence alignment using sentence embeddings · ☆133 · Updated last year
- ☆26 · Updated last year
- A TextTiling-based algorithm for text segmentation (aka topic segmentation) that uses neural sentence encoders, as well as extractive sum… · ☆51 · Updated 2 years ago
- GenieNLP: A versatile codebase for any NLP task · ☆88 · Updated last year
- ☆57 · Updated 3 years ago
- Fast Neural Machine Translation in C++ - development repository · ☆22 · Updated last year
- Code for OpenAI Whisper Web App Demo · ☆93 · Updated 3 years ago
- 🤗Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0. · ☆56 · Updated 3 years ago
- This public GitHub repository contains code for a fully self-hosted, on-premise transcription solution. · ☆53 · Updated last year
- Official implementation of the paper "CoEdIT: Text Editing by Task-Specific Instruction Tuning" (EMNLP 2023) · ☆134 · Updated last year
- Translate HTML using Argos Translate · ☆54 · Updated 2 years ago
- ☆80 · Updated 2 weeks ago
- Training & Implementation of chatbots leveraging GPT-like architecture with the aitextgen package to enable dynamic conversations. · ☆49 · Updated 3 years ago
- Neural Machine Translation (NMT) tutorial. Data preprocessing, model training, evaluation, and deployment. · ☆173 · Updated this week
- A list of awesome Machine Translation frameworks, libraries, software and papers · ☆195 · Updated last year
- Efficient Low-Memory Aligner · ☆146 · Updated 11 months ago
- A sentence segmentation library with wide language support optimized for speed and utility. · ☆76 · Updated 2 weeks ago
- Library and command line utility to do approximate string matching of a source against a bitext index and get matched source and target. · ☆50 · Updated 8 months ago
- This repository contains code for the paper "Meet Your Favorite Character: Open-domain Chatbot Mimicking Fictional Characters with only a… · ☆13 · Updated 3 years ago
- MILES is a multilingual text simplifier inspired by LSBert - A BERT-based lexical simplification approach proposed in 2018. Unlike LSBert… · ☆50 · Updated 4 years ago
- Command-line script for inferencing from models such as LLaMA, in a chat scenario, with LoRA adaptations · ☆33 · Updated 2 years ago