A Python toolkit for file processing, text cleaning and data splitting.
☆36 · Oct 18, 2022 · Updated 3 years ago
Alternatives and similar repositories for Takin
Users who are interested in Takin are comparing it to the libraries listed below.
- simple translate ☆12 · Mar 7, 2020 · Updated 6 years ago
- This repository contains datasets (including the test set) for the EMNLP-IJCNLP 2019 paper "BiPaR: A Bilingual Parallel Dataset for Multilingu… ☆23 · Jul 13, 2021 · Updated 4 years ago
- Code examples for four Simple Transformers tasks (classification, named entity recognition, machine reading comprehension, language model fine-tuning), with support for switching among multiple pretrained models. ☆23 · Jun 7, 2022 · Updated 3 years ago
- Source code and dataset for the paper "GECOR: An End-to-End Generative Ellipsis and Co-reference Resolution Model for Task-Oriented Dialo… ☆30 · Jul 22, 2023 · Updated 2 years ago
- MNBVC project: ShareGPT corpus cleaning ☆16 · Oct 4, 2023 · Updated 2 years ago
- Open-source QG (Question Generation) system, written with PyTorch and Transformer ☆55 · Jul 25, 2024 · Updated last year
- Extract Chinese/English QA Data from WikiHow pages. ☆16 · May 21, 2023 · Updated 2 years ago
- NLP pre-/post-processing tools. ☆30 · Mar 31, 2025 · Updated last year
- The QA datasets used for DrQA evaluation. ☆14 · Nov 30, 2018 · Updated 7 years ago
- Using LLMs to chat with knowledge ☆21 · Aug 12, 2023 · Updated 2 years ago
- Crawlers for Douban, Zhihu, Mafengwo, TripAdvisor, Twitter, and related sites ☆24 · Dec 13, 2017 · Updated 8 years ago
- Advancing Spatial-Temporal Rock Fracture Prediction with Virtual Camera-Based Data Augmentation ☆12 · Jan 19, 2025 · Updated last year
- The Wizard of Oz code used for collecting goal-oriented dialogue systems ☆13 · Oct 30, 2017 · Updated 8 years ago
- Implements an RNN model for the DSTC2 task ☆16 · Jan 25, 2019 · Updated 7 years ago
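- Covers web crawling, databases, data analysis, machine learning, visualization, text analysis, GUI, and office automation ☆14 · Jan 14, 2022 · Updated 4 years ago
- A knowledge-graph-based data analysis system for scientific novelty search that processes and analyzes data such as scientific reports and literature. It uses techniques such as knowledge graphs and TextCNN text classification; the front end and back end use vue and springboot respectively, the databases are MySQL and neo4j, and echarts charts display the analysis results. ☆14 · Mar 28, 2025 · Updated last year
- Chinese large language model evaluation: 2024 Gaokao mathematics special topic ☆19 · Jun 14, 2024 · Updated last year
- 🏆🏆 "Large models": All in one & All from scratch. 🌍🌍 Collect and clean data, train a Tokenizer, pretraining, SFT, GRPO! ☆56 · Aug 12, 2025 · Updated 8 months ago
- This repository contains code and models for the paper: Semantic Graphs for Generating Deep Questions (ACL 2020). ☆65 · Jan 20, 2024 · Updated 2 years ago
- Stochastic Answer Networks (SAN) for Machine Reading Comprehension ☆149 · Nov 26, 2018 · Updated 7 years ago
- A literature search and download program for China National Knowledge Infrastructure (CNKI), based on Python crawling, that simplifies literature search and information download! ☆17 · Jun 18, 2023 · Updated 2 years ago
- A Fast(er) and Accurate Syntactic Parsing by Exacter Searching. ☆17 · Jul 25, 2024 · Updated last year
- DocQues answers queries on longer and multiple documents, built on GPT-Index and GPT-3 ☆13 · Jan 1, 2023 · Updated 3 years ago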
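- A demonstration of how to train a custom tokenizer similar to TikToken. ☆15 · Jan 6, 2025 · Updated last year
- Notes on deep learning and NLP ☆27 · Jun 17, 2019 · Updated 6 years ago
- Sun Yat-sen University natural language processing project: Chinese word segmentation (sequence labeling / named entity recognition). Implemented in Keras with a BiLSTM+CRF framework. ☆18 · Jan 30, 2021 · Updated 5 years ago
- PaddlePaddle regular competition: first-place solution for September in Chinese news headline classification, score 0.9+; trains and optimizes a 14-class news classifier by fine-tuning pretrained models with PaddleNLP ☆19 · Oct 15, 2021 · Updated 4 years ago
- A fully featured piano with multiplayer support ☆15 · Jul 26, 2025 · Updated 8 months ago
- Toward Scalable Neural Dialogue State Tracking Model ☆20 · Sep 23, 2022 · Updated 3 years ago
- Top-Down BTG-based Preordering ☆16 · Jan 14, 2016 · Updated 10 years ago
- sailVina reverse docking scripts for Linux ☆10 · Feb 14, 2021 · Updated 5 years ago
- ☆40 · Jan 3, 2023 · Updated 3 years ago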
- A bert-based Chinese natural language processing tool, including sentiment analysis, Chinese word segmentation, part-of-speech tagging, and named entity recognition, with training and prediction interfaces for text classification, sequence labeling, and sentence-pair relation classification tasks ☆135 · Mar 13, 2019 · Updated 7 years ago
- Convert SVG files to GeoJSON ☆11 · Feb 17, 2026 · Updated last month
- The source code of the paper 'Dynamic Knowledge Routing Network For Target-Guided Open-Domain Conversation' ☆24 · Mar 24, 2023 · Updated 3 years ago
- Sparse Multilabel Categorical Crossentropy ☆11 · Sep 10, 2023 · Updated 2 years ago
- PyTorch version of Chinese multi-turn Cdial based on gpt+nezha ☆11 · Oct 22, 2022 · Updated 3 years ago
- Facial Landmark Detection using OpenCV and Mediapipe ☆12 · Jul 4, 2022 · Updated 3 years ago
- Easily deployable nyaa proxy to access nyaa from blocked regions, using vercel rewrites ☆24 · Nov 18, 2025 · Updated 4 months ago