A Python toolkit for file processing, text cleaning, and data splitting.
☆36 Oct 18, 2022 Updated 3 years ago
Alternatives and similar repositories for Takin
Users that are interested in Takin are comparing it to the libraries listed below.
- simple translate ☆12 Mar 7, 2020 Updated 6 years ago
- This repository contains datasets (including testing set) for EMNLP-IJCNLP 2019 paper "BiPaR: A Bilingual Parallel Dataset for Multilingu… ☆23 Jul 13, 2021 Updated 4 years ago
- Code samples for four Simple Transformers tasks (classification, named entity recognition, machine reading comprehension, and language model fine-tuning), with support for switching among multiple pretrained models. ☆23 Jun 7, 2022 Updated 3 years ago
- MNBVC project: ShareGPT corpus cleaning ☆16 Oct 4, 2023 Updated 2 years ago
- An open-source QG (Question Generation) system, built on PyTorch and Transformer ☆55 Jul 25, 2024 Updated last year
- Extract Chinese/English QA Data from WikiHow pages. ☆16 May 21, 2023 Updated 3 years ago
- Sentence perplexity calculation based on a Chinese GPT2 pretrained model ☆15 Apr 20, 2023 Updated 3 years ago
- NLP pre-/post-processing tools. ☆30 Mar 31, 2025 Updated last year
- Using LLMs to chat with knowledge ☆21 Aug 12, 2023 Updated 2 years ago
- Crawls Douban movie reviews with the Scrapy framework, cleans and analyzes the data with Python, and finally visualizes the results ☆15 Sep 5, 2020 Updated 5 years ago
- Code for ACL 2019 paper "cross lingual training for automatic question generation" ☆14 Jun 30, 2019 Updated 6 years ago
- Analysis codes for Laser-Induced Breakdown Spectroscopy data ☆10 Aug 19, 2017 Updated 8 years ago
- Advancing Spatial-Temporal Rock Fracture Prediction with Virtual Camera-Based Data Augmentation ☆12 Jan 19, 2025 Updated last year
- CamRest676 is an English dataset; I translated it into Chinese for training NLU. ☆12 Dec 20, 2017 Updated 8 years ago
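- The Wizard of Oz code used for collecting goal-oriented dialogue systems ☆13 Oct 30, 2017 Updated 8 years ago
- Implements an RNN model for the DSTC2 task ☆16 Jan 25, 2019 Updated 7 years ago
- Covers web crawling, databases, data analysis, machine learning, visualization, text analysis, GUI, and office automation ☆14 Jan 14, 2022 Updated 4 years ago
- Deep learning for tissue parameter estimation in magnetic resonance fingerprinting ☆13 Oct 5, 2019 Updated 6 years ago
- A knowledge-graph-based analysis system for sci-tech novelty search data that processes and analyzes scientific reports, literature, and similar data. It uses knowledge graphs, TextCNN text classification, and related techniques; the front end and back end use Vue and Spring Boot respectively, the databases are MySQL and Neo4j, and analysis results are displayed with ECharts charts. ☆14 Mar 28, 2025 Updated last year
- Simple analysis of Weibo topics: topic crawling, high-frequency word extraction, word cloud generation, and sentiment scoring, using python + selenium + jieba + snownlp + wordcloud ☆33 Jan 28, 2021 Updated 5 years ago
- This repository contains code and models for the paper: Semantic Graphs for Generating Deep Questions (ACL 2020). ☆65 Jan 20, 2024 Updated 2 years ago
- A literature search and download program for CNKI (China National Knowledge Infrastructure) based on Python web crawling, making literature search and information download more convenient! ☆18 Jun 18, 2023 Updated 2 years ago
- A Fast(er) and Accurate Syntactic Parsing by Exacter Searching. ☆17 Jul 25, 2024 Updated last year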
- DocQues answers queries on longer and multiple documents, built on GPT-Index and GPT-3 ☆13 Jan 1, 2023 Updated 3 years ago
- A demonstration of how to train a custom tokenizer similar to TikToken. ☆15 Jan 6, 2025 Updated last year
- Notes on deep learning and NLP ☆27 Jun 17, 2019 Updated 6 years ago
- Uses JD.com reviews as the dataset and classifies them with common machine learning algorithms such as KNN, SVM, logistic regression, Bayes, and xgboost. Also classifies with the deep learning models CNN, RNN, combined CNN and RNN, Bi-GRU, and bert, and builds text classification with the fastnlp framework. ☆31 Jul 2, 2020 Updated 5 years ago
- PaddlePaddle regular competition: first-place solution for September in Chinese news headline classification, score 0.9+; trains and optimizes a 14-class news classification model by fine-tuning pretrained models with PaddleNLP ☆19 Oct 15, 2021 Updated 4 years ago
- A Large-Scale Dataset for Long Text and Multi-Table Summarization ☆18 Feb 21, 2024 Updated 2 years ago
- Batch processor to enable large content be digested by Ollama, focused around book processing and translations by default, fully, configu… ☆36 Oct 27, 2025 Updated 6 months ago
- Python code & Cloudflare worker for Mistral-OCR ☆12 Mar 8, 2025 Updated last year
- Toward Scalable Neural Dialogue State Tracking Model ☆20 Sep 23, 2022 Updated 3 years ago
- Top-Down BTG-based Preordering ☆16 Jan 14, 2016 Updated 10 years ago
- ☆40 Jan 3, 2023 Updated 3 years ago
- Convert SVG files to GeoJSON ☆11 Feb 17, 2026 Updated 3 months ago
- Materials for AACL-IJCNLP-2022 tutorial: Efficient and Robust Knowledge Graph Construction ☆28 Feb 3, 2023 Updated 3 years ago
- The source code of the paper 'Dynamic Knowledge Routing Network For Target-Guided Open-Domain Conversation' ☆24 Mar 24, 2023 Updated 3 years ago
- Analyzing knowledge graph embedding methods, including TransE, DistMult, CP, SimplE, ComplEx, Quaternion ☆28 May 23, 2023 Updated 2 years ago
- Text2Neo4j is a tool that traverses documents, extracts relations from text, and saves them to a Neo4j database to form a knowledge graph. The project combines Dify and LLaMA3.1 (8B model) to process and extract complex relations efficiently. ☆24 Aug 31, 2024 Updated last year