2hip3ng / chinese-text-clean
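Chinese text data cleaning: removes URLs, removes characters other than Chinese, English, and digits, segments words, removes stop words, and removes blank lines (add custom cleaning steps as the text requires)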
☆17 · May 5, 2019 · Updated 6 years ago
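
The cleaning steps named in the description can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the names `clean_lines` and `naive_tokenize` are hypothetical, and the default tokenizer is a crude stand-in that splits Chinese text into single characters. A real word segmenter (for example `jieba.lcut`, if jieba is installed) could be passed as `tokenize` instead.

```python
import re

# ASCII-only URL pattern, so Chinese text right after a URL is not swallowed.
URL_RE = re.compile(r"(?:https?://|www\.)[\w\-./?%&=#:~+]+", re.ASCII)
# Anything that is not a common CJK ideograph, an ASCII letter, or a digit.
NON_KEPT_RE = re.compile(r"[^\u4e00-\u9fffA-Za-z0-9]+")
# Stand-in tokens: one Chinese character, or a run of ASCII letters/digits.
TOKEN_RE = re.compile(r"[\u4e00-\u9fff]|[A-Za-z0-9]+")


def naive_tokenize(text):
    """Placeholder segmenter (assumption): not a real word segmenter."""
    return TOKEN_RE.findall(text)


def clean_lines(lines, stopwords=frozenset(), tokenize=naive_tokenize):
    """Apply the listed steps to each line and return the non-empty results."""
    cleaned = []
    for line in lines:
        line = URL_RE.sub(" ", line)       # 1. remove URLs
        line = NON_KEPT_RE.sub(" ", line)  # 2. remove other characters
        tokens = tokenize(line)            # 3. segment into words
        # 4. remove stop words (and any whitespace-only tokens)
        tokens = [t for t in tokens if t.strip() and t not in stopwords]
        if tokens:                         # 5. drop lines left empty
            cleaned.append(" ".join(tokens))
    return cleaned
```

Custom cleaning steps would slot in as extra substitutions before tokenization, or as extra filters on the token list.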
Alternatives and similar repositories for chinese-text-clean
Users that are interested in chinese-text-clean are comparing it to the libraries listed below
- Spark-Python study notes ☆11 · Sep 25, 2018 · Updated 7 years ago
- Recent papers on Graph Neural Networks-based Recommender System. ☆12 · Aug 21, 2023 · Updated 2 years ago
- Scraped reviews from OpenRice for sentiment analysis. Formatted to use with BERT. ☆11 · Apr 9, 2020 · Updated 5 years ago
- ☆11 · Apr 10, 2019 · Updated 6 years ago
- ☆12 · Dec 22, 2020 · Updated 5 years ago
- The source code of "Deep attention diffusion graph neural networks for text classification" ☆13 · Nov 11, 2023 · Updated 2 years ago
- Python cffi binding to CppJieba ☆15 · Sep 15, 2020 · Updated 5 years ago
- MCP agent/client/server implementation for private knowledge base ☆21 · May 19, 2025 · Updated 8 months ago
- A text-processing library; currently includes new-word discovery, string matching, and other features. ☆15 · Jul 6, 2021 · Updated 4 years ago
- A span-based joint named entity recognition (NER) and relation extraction model. ☆11 · Aug 5, 2020 · Updated 5 years ago
- Any Stream to Reinforcement Learning Environment (Time Series Data, Stock Market) ☆11 · Oct 10, 2018 · Updated 7 years ago
- In this project, we need to find out commercial products listed on Google that refer to the same entity across Amazon by comparing the si… ☆11 · Nov 7, 2016 · Updated 9 years ago
- Utils for mapping dataclass fields to dictionary keys, making it possible to create an instance of a dataclass from a dictionary. ☆14 · Jun 22, 2023 · Updated 2 years ago
- A structured parsing technique for NER ☆15 · May 26, 2023 · Updated 2 years ago
- Fast graph database in pure Python ☆16 · Aug 31, 2021 · Updated 4 years ago
- ALPHA: AnomaLous Physiological Health Assessment Using Large Language Models (AI Health Summit 23) ☆19 · Feb 25, 2025 · Updated 11 months ago
- Large Language Models in Molecular Embeddings ☆12 · May 1, 2024 · Updated last year
- Understanding Word2Vec with Gensim and Elang (Python Packages) ☆13 · Apr 24, 2020 · Updated 5 years ago
- Parsing PDF files with PDFium ☆12 · Nov 7, 2024 · Updated last year
- this project is developing to crawl stock A finance and trade data from website, process finance and trade data to get factors, and then … ☆17 · Jan 12, 2023 · Updated 3 years ago
- Stock investment can be one of the ways to manage one’s asset. Technical analysis is sometimes used in financial markets to assist trader… ☆12 · Sep 30, 2020 · Updated 5 years ago
- An intelligent OCR to detect tables and pure text inside PDFs and obtaining a csv file and a txt from it ☆15 · Sep 11, 2018 · Updated 7 years ago
- ☆13 · Apr 7, 2021 · Updated 4 years ago
- Python utils and decorators for caching with TTL, maxsize and file-based storage ☆15 · Oct 3, 2018 · Updated 7 years ago
- ☆15 · Feb 5, 2019 · Updated 7 years ago
- A general graph manipulation python module ☆16 · Jun 2, 2009 · Updated 16 years ago
- Named Entity Recognition via Attention_based CNNs-BiLSTM-CRF ☆15 · Jun 27, 2018 · Updated 7 years ago
- Pattern of Resume. ☆17 · Aug 6, 2017 · Updated 8 years ago
- BiLSTM+CNN+CRF NER, using pytorch ☆16 · May 26, 2019 · Updated 6 years ago
- Fine tuning of the Retrieval-Augmented Generation (RAG) with a custom knowledge source. ☆13 · Feb 10, 2021 · Updated 5 years ago
- Part-of-Speech Tagging Models in Python ☆15 · Oct 7, 2019 · Updated 6 years ago
- A simple Chinese NER implementation with Bert-Bi-LSTM-CRF, built on the fastNLP framework ☆15 · Sep 25, 2020 · Updated 5 years ago
- ☆20 · Jul 22, 2021 · Updated 4 years ago
- An implementation of bidirectional LSTM-CRF for Named Entity Relationship on custom corpus with custom word embeddings ☆14 · Apr 9, 2019 · Updated 6 years ago
- Source Codes of graphSEAT (CIKM'20) ☆16 · Jan 19, 2021 · Updated 5 years ago
- ☆19 · Nov 7, 2018 · Updated 7 years ago
- GUI useful to manually annotate text for Named Entity Recognition purposes ☆14 · Jun 22, 2023 · Updated 2 years ago
- Applications of ELMO to NLP tasks such as question answering (QA) and text classification ☆15 · Apr 13, 2019 · Updated 6 years ago
- Examples using PyTorch-BigGraph ☆17 · Jun 21, 2019 · Updated 6 years ago