zaemyung/sentsplit

A flexible sentence segmentation library using a CRF model and regex rules

☆32

Alternatives and similar repositories for sentsplit

Users that are interested in sentsplit are comparing it to the libraries listed below.

zaemyung / streamlit-tutorial
A simple tutorial script on Streamlit using the Iris Dataset
☆13 • Sep 13, 2023 • Updated 2 years ago
passing2961 / EmoNSMC
Large Korean emotion-labeled dataset (EmoNSMC)
☆14 • Mar 5, 2020 • Updated 6 years ago
passing2961 / PersonaChatGen
🎭 Official code and dataset for our CCGPK@COLING 2022 paper - "PersonaChatGen: Generating Personalized Dialogue using GPT-3"
☆13 • Mar 26, 2024 • Updated 2 years ago
minnesotanlp / cobbler
Code and data for Koo et al.'s ACL 2024 paper "Benchmarking Cognitive Biases in Large Language Models as Evaluators"
☆23 • Feb 16, 2024 • Updated 2 years ago
passing2961 / Stark
Official code and dataset for our EMNLP 2024 Findings paper: Stark: Social Long-Term Multi-Modal Conversation with Persona Commonsense Kn…
☆19 • Dec 27, 2024 • Updated last year
zaemyung / crawl-reuters
A simple Scrapy script for crawling Reuters news articles (Python 3)
☆14 • Jan 17, 2018 • Updated 8 years ago
jonghwanhyeon / overwatch-stats
A Python library to query a player's Overwatch stats from Battle.net
☆13 • Nov 12, 2018 • Updated 7 years ago
chaojiang06 / arXivEdits
Data for EMNLP 2022 paper "arXivEdits: Understanding the Human Revision Process in Scientific Writing".
☆14 • Sep 30, 2023 • Updated 2 years ago
vipulraheja / iterater
Official implementation of the paper "IteraTeR: Understanding Iterative Revision from Human-Written Text" (ACL 2022)
☆83 • Nov 15, 2023 • Updated 2 years ago
passing2961 / KMRE
Korean Movie Review Emotion (KMRE) Dataset
☆21 • Sep 7, 2020 • Updated 5 years ago
clovaai / lookwhostalking
Look Who’s Talking: Active Speaker Detection in the Wild
☆76 • Aug 24, 2023 • Updated 2 years ago
QuoQA-NLP / Ko-conceptual-captions
Google's Conceptual Captions Dataset translated into Korean
☆23 • Aug 28, 2022 • Updated 3 years ago
jonghwanhyeon / python-mecab-ko
A Python binding for mecab-ko
☆111 • Jul 14, 2024 • Updated 2 years ago
songys / 2021Langcon
☆11 • Oct 3, 2021 • Updated 4 years ago
ko-nlp / moducorpus-sanitizer
Provides functionality for converting Modu Corpus (모두의 말뭉치) data into a form convenient for analysis.
☆11 • Mar 2, 2022 • Updated 4 years ago
tunib-ai / artwork_captions
Machine Generated Captions for Best Artworks
☆22 • Sep 21, 2022 • Updated 3 years ago
AIRC-KETI / Korean-Copora
☆14 • Dec 9, 2021 • Updated 4 years ago
dlfrnaos19 / tpu-starter-korean
☆10 • Oct 21, 2022 • Updated 3 years ago
JoungheeKim / kor-spacing
A project for automatic Korean word spacing
☆12 • Aug 3, 2020 • Updated 5 years ago
disrpt / sharedtask2021
Repository for the DISRPT2021 shared task
☆16 • Sep 5, 2022 • Updated 3 years ago
Open-Galapagos / evolution-fine-tuning
Official code, models, and dataset for "Evolution Fine-Tuning (EFT): Learning to Discover Across 371 Optimization Tasks"
☆25 • Jun 30, 2026 • Updated 3 weeks ago
jeongukjae / korean-wikipedia-corpus
Korean Wikipedia corpus segmented into sentences. Download it from Releases or use it via tfds-korean.
☆24 • Sep 6, 2023 • Updated 2 years ago
TurboNLP / Translate-Demo
A Translation Task using TurboTransformers
☆10 • Dec 17, 2020 • Updated 5 years ago
kakaobrain / autowu
Official repository for Automated Learning Rate Scheduler for Large-Batch Training (8th ICML Workshop on AutoML)
☆39 • Dec 3, 2021 • Updated 4 years ago
UCSB-NLP-Chang / Prereq_tune
Implementation for the paper "Fictitious Synthetic Data Can Improve LLM Factuality via Prerequisite Learning"
☆11 • Jan 10, 2025 • Updated last year
google-research-datasets / PropSegmEnt
PropSegmEnt is an annotated dataset for segmenting English text into propositions, and recognizing proposition-level entailment relations…
☆21 • Dec 21, 2022 • Updated 3 years ago
Priya22 / pdnc-lrec2022
Repo for the LREC 2022 paper "The Project Dialogism Novel Corpus: A Dataset for Quotation Attribution in Literary Texts".
☆14 • Jul 27, 2022 • Updated 4 years ago
dykang / xslue
ACL 2021 paper "Style is NOT a single variable: Case Studies for Cross-Style Language Understanding" by Dongyeop Kang and Eduard Hovy
☆15 • Jul 19, 2021 • Updated 5 years ago
cofe-ai / fast-gector
☆63 • Aug 2, 2023 • Updated 2 years ago
triplet02 / KoNPron
Convert Numerical Representations to Korean Pronunciation
☆14 • Apr 20, 2020 • Updated 6 years ago
passing2961 / EmpGPT-3
Official code for our COLING 2022 paper: In-Context Learning for Empathetic Dialogue Generation
☆20 • Mar 1, 2023 • Updated 3 years ago
lium-lst / wmt17-mmt
Data and code for replicating WMT17 Multimodal Translation results
☆16 • Oct 10, 2018 • Updated 7 years ago
LauraRuis / do-pigs-fly
☆22 • Oct 22, 2023 • Updated 2 years ago
blcuicall / litmind-dictionary
An open-source online generative dictionary
☆13 • May 29, 2022 • Updated 4 years ago
BitnaKeum / Web_Crawler
Crawlers for Namuwiki, Wikipedia, Daum Blog, Tistory, YouTube, and Nate Pann
☆13 • Feb 20, 2026 • Updated 5 months ago
Data-Intelligence-Lab / DEFT-korean-alpaca
☆23 • Oct 30, 2023 • Updated 2 years ago
monologg / ko_lm_dataformat
A utility for storing and reading files for Korean LM training 💾
☆35 • Jul 18, 2026 • Updated last week
uthree / ddsp-vocoder
☆12 • Nov 7, 2024 • Updated last year
Waffle-Liu / STRODE
STRODE: Stochastic Boundary Ordinary Differential Equation
☆13 • Jul 20, 2021 • Updated 5 years ago