DoodleBears / split-lang
✨ Split text by languages (e.g. 你喜欢看アニメ吗 -> 你喜欢看 | アニメ | 吗) for NLP tasks (e.g. parse, TTS). Powered by fasttext and budoux
☆48 · Updated 3 weeks ago
Alternatives and similar repositories for split-lang:
Users who are interested in split-lang are comparing it to the libraries listed below.
- Developer-friendly inference code for RVC-Boss/GPT-SoVITS. ☆15 · Updated 5 months ago
- ⚡️ 80x faster Fasttext language detection out of the box | Split text by language ☆174 · Updated 2 weeks ago
- A streamlined, user-friendly JSON streaming preprocessor, crafted in Python. ☆97 · Updated 6 months ago
- Tool to allow parsing large JSON files without loading into memory. Developed in Rust with adapters in other programming languages for ea… ☆28 · Updated last year
- ☆89 · Updated last month
- Archived 🚧 | 🌻 Building ChatBot with LLMs. 🌻 | Using async requests. | Adaptable to multiple LLMs | General-purpose large language model agent framework | Multi-persona, fully type-annotated ☆39 · Updated last year
- Chrome extension to add a link from each Arxiv page to the corresponding HF Paper page ☆25 · Updated last year
- A lightweight end-to-end text-to-speech model ☆110 · Updated 3 weeks ago
- Evaluation for AI apps and agents ☆36 · Updated last year
- ☆18 · Updated 11 months ago
- 🔧 Repair JSON! Solution for JSON anomalies from LLMs. ☆227 · Updated 8 months ago
- A transformer-based multimodal model for music. ☆28 · Updated 7 months ago
- ☆32 · Updated last year
- Edit videos with a text editor ☆37 · Updated last year
- We Speech Transcript based on LLM, in 300 lines of code. ☆149 · Updated 2 weeks ago
- Turn any OCR model into an online inference API endpoint 🚀 🌖 ☆54 · Updated this week
- A unified interface for multiple Text-to-Speech (TTS) providers. ☆261 · Updated 2 months ago
- A Cloudflare Worker that implements a VLESS server ☆17 · Updated last month
- Grapheme-to-Phoneme lexicons for Chinese dialects ☆67 · Updated 2 years ago
- SenseVoice-python: An enterprise-grade open-source multilingual ASR system from FunASR, running on onnxruntime ☆85 · Updated 5 months ago
- An API project for SenseVoice that outputs timestamped subtitles ☆34 · Updated 4 months ago
- A simple, easy-to-use streaming JSON preprocessor. ☆58 · Updated 3 months ago
- Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities. ☆9 · Updated 8 months ago
- A simple SVS labeling tool ☆12 · Updated 7 months ago
- An LLM-targeted template engine, built upon Jinja. ☆11 · Updated 3 months ago
- The YouTube Text-To-Speech dataset comprises waveform audio extracted from YouTube videos alongside their English transcriptions ☆51 · Updated 3 years ago
- Friendly Arm (友善之臂) ☆24 · Updated 11 months ago
- Overrides the built-in pinyin data in pypinyin with the pinyin data files from pinyin-data and phrase-pinyin-data ☆56 · Updated 2 months ago