A Python toolkit for file processing, text cleaning and data splitting.
☆36 · Oct 18, 2022 · Updated 3 years ago
Alternatives and similar repositories for Takin
Users who are interested in Takin are comparing it to the libraries listed below.
- This repository contains datasets (including testing set) for EMNLP-IJCNLP 2019 paper "BiPaR: A Bilingual Parallel Dataset for Multilingu… ☆23 · Jul 13, 2021 · Updated 4 years ago
- Code examples for four Simple Transformers tasks (classification, named entity recognition, machine reading comprehension, language model fine-tuning), with support for switching between multiple pretrained models. ☆23 · Jun 7, 2022 · Updated 3 years ago
- Source code and dataset for the paper "GECOR: An End-to-End Generative Ellipsis and Co-reference Resolution Model for Task-Oriented Dialo… ☆30 · Jul 22, 2023 · Updated 2 years ago
- MNBVC project: ShareGPT corpus cleaning ☆16 · Oct 4, 2023 · Updated 2 years ago
- Open-source QG (Question Generation) system, written with PyTorch and Transformer ☆55 · Jul 25, 2024 · Updated last year
- Extract Chinese/English QA Data from WikiHow pages. ☆16 · May 21, 2023 · Updated 2 years ago
- Sentence perplexity calculation based on a Chinese GPT2 pretrained model ☆15 · Apr 20, 2023 · Updated 3 years ago
- NLP pre-/post-processing tools. ☆30 · Mar 31, 2025 · Updated last year
- Code for the ACL 2019 paper "cross lingual training for automatic question generation" ☆14 · Jun 30, 2019 · Updated 6 years ago
- Review text classification based on Douban movie ratings; builds word vectors with TF-IDF/word2vec/BERT and classifies with SVM and logistic regression models ☆18 · Jan 8, 2022 · Updated 4 years ago
- Advancing Spatial-Temporal Rock Fracture Prediction with Virtual Camera-Based Data Augmentation ☆12 · Jan 19, 2025 · Updated last year
- Weibo public opinion and user behavior visualization platform ☆23 · Mar 27, 2023 · Updated 3 years ago
- The Wizard of Oz code used for collecting goal-oriented dialogue systems ☆13 · Oct 30, 2017 · Updated 8 years ago
- Covers web crawling, databases, data analysis, machine learning, visualization, text analysis, GUI, and office automation ☆14 · Jan 14, 2022 · Updated 4 years ago
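- 💬 A simple online chat room. ☆11 · Apr 15, 2019 · Updated 7 years ago
- A knowledge-graph-based data analysis system for sci-tech novelty search that processes and analyzes data such as scientific reports and literature. It uses knowledge graphs, TextCNN text classification, and related techniques; the front end and back end use Vue and Spring Boot respectively, the databases are MySQL and Neo4j, and ECharts charts present the analysis results. ☆14 · Mar 28, 2025 · Updated last year
- Examples of using MGeo finetune models ☆55 · Feb 9, 2023 · Updated 3 years ago
- LLM evaluation on 2024 Chinese Gaokao Mathematics: zero-contamination benchmark with dual prompt formats ☆19 · Apr 15, 2026 · Updated 2 weeks ago
- 🎯 Enterprise-grade AI assistant rule system (Chinese edition) - built for Chinese developers, with one-click installation and configuration for mainstream AI tools such as Augment, Cursor, Claude Code, and Trae AI ☆28 · Aug 1, 2025 · Updated 9 months ago
- A literature search and download program for China National Knowledge Infrastructure (CNKI) built on Python web-crawling techniques, making literature search and information download more convenient! ☆18 · Jun 18, 2023 · Updated 2 years ago
- A Fast(er) and Accurate Syntactic Parsing by Exacter Searching. ☆17 · Jul 25, 2024 · Updated last year
- DocQues answers queries on longer and multiple documents, built on GPT-Index and GPT-3 ☆13 · Jan 1, 2023 · Updated 3 years ago
- Weibo crawler based on keyword search results ☆31 · Nov 6, 2018 · Updated 7 years ago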
- A demonstration of how to train a custom tokenizer similar to TikToken. ☆15 · Jan 6, 2025 · Updated last year
- Notes on deep learning and NLP ☆27 · Jun 17, 2019 · Updated 6 years ago
- Sun Yat-sen University natural language processing project: Chinese word segmentation (sequence labeling/named entity recognition). Implemented in Keras with a BiLSTM+CRF framework. ☆18 · Jan 30, 2021 · Updated 5 years ago
- A Large-Scale Dataset for Long Text and Multi-Table Summarization ☆18 · Feb 21, 2024 · Updated 2 years ago
- A fully featured piano with multiplayer support ☆15 · Jul 26, 2025 · Updated 9 months ago
- When hosted on Heroku, it can be used to proxy SSL URLs to your Facebook application. ☆13 · Jan 29, 2015 · Updated 11 years ago
- OKX API Interface Resender Server ☆19 · Feb 28, 2024 · Updated 2 years ago
- Materials for AACL-IJCNLP-2022 tutorial: Efficient and Robust Knowledge Graph Construction ☆28 · Feb 3, 2023 · Updated 3 years ago
- The source code of the paper 'Dynamic Knowledge Routing Network For Target-Guided Open-Domain Conversation' ☆24 · Mar 24, 2023 · Updated 3 years ago
- Sparse Multilabel Categorical Crossentropy ☆11 · Sep 10, 2023 · Updated 2 years ago
- Analyzing knowledge graph embedding methods, including TransE, DistMult, CP, SimplE, ComplEx, Quaternion ☆28 · May 23, 2023 · Updated 2 years ago
- Text2Neo4j is a tool that traverses documents, extracts relations from text, and saves them to a Neo4j database to form a knowledge graph. The project combines Dify and LLaMA3.1 (8B model) to process and extract complex relations efficiently. ☆24 · Aug 31, 2024 · Updated last year
- PyTorch implementation of Chinese multi-turn Cdial based on GPT+NEZHA ☆11 · Oct 22, 2022 · Updated 3 years ago
- Facial Landmark Detection using OpenCV and Mediapipe ☆12 · Jul 4, 2022 · Updated 3 years ago
- Easily deployable nyaa proxy for accessing nyaa from blocked regions, using Vercel rewrites ☆24 · Nov 18, 2025 · Updated 5 months ago
- English-French MT dialogue dataset ☆17 · Apr 29, 2022 · Updated 4 years ago