Tanh-wink / Crawl
Uses a multi-threaded crawler to collect idiom data
☆14 Dec 11, 2020 Updated 5 years ago
Alternatives and similar repositories for Crawl
Users who are interested in Crawl are comparing it to the libraries listed below
- Python class for Elasticsearch, including add, batch add, update, delete, query, and scan query. Also includes a demo that puts Wikipedia in… ☆17 Sep 3, 2022 Updated 3 years ago
- A BERT-based baseline for the Chinese idiom cloze test with PyTorch. ☆18 Dec 24, 2020 Updated 5 years ago
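- A wrapper class for the tf-idf model that computes tf-idf values for all documents and implements a tf-idf-based search engine: given a query, it computes the similarity to each document and returns the top-k documents most similar to the query. ☆16 Nov 20, 2020 Updated 5 years ago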
- Semantic similarity: word2vec + WMD, BERT + WMD, PyTorch ☆31 Jan 29, 2024 Updated 2 years ago
- DataFountain epidemic government affairs quiz assistant competition. We divided this task into two parts: document retrieval and answer e… ☆14 Aug 21, 2022 Updated 3 years ago
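- Solution write-up for the DataFountain epidemic government affairs Q&A assistant ☆16 May 2, 2020 Updated 5 years ago
- Documentation notes ☆15 Mar 16, 2021 Updated 4 years ago
- ChineseBert for Chinese spelling correction ☆43 Mar 14, 2023 Updated 2 years ago
- Chinese large language model evaluation, phase two ☆71 Oct 23, 2023 Updated 2 years ago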
- MEASURING MASSIVE MULTITASK CHINESE UNDERSTANDING ☆89 Mar 24, 2024 Updated last year
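- A capsule-based model for opinion-type reading comprehension ☆88 Aug 8, 2019 Updated 6 years ago
- Kesci LES Cup (莱斯杯): summary of slides, documents, and code from the top ten teams of the 2nd national "Military Intelligent Machine Reading" challenge ☆132 Feb 5, 2020 Updated 6 years ago
- CAIL 2019 (法研杯) reading comprehension track, top 3 ☆151 Nov 13, 2023 Updated 2 years ago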
- Neural word segmentation with rich pretraining, code for ACL 2017 paper ☆164 Jan 10, 2019 Updated 7 years ago
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. ☆160 Jun 18, 2024 Updated last year
- ☆368 Jul 19, 2023 Updated 2 years ago
- Reject complicated operations for incorporating lexicon for Chinese NER. ☆437 Jan 22, 2022 Updated 4 years ago
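- KgCLUE: large-scale open-source Chinese knowledge graph question answering ☆455 Jul 5, 2022 Updated 3 years ago
- A Chinese BERT that uses words as the basic unit ☆474 Nov 18, 2021 Updated 4 years ago
- Open-source release of the MuCGEC Chinese error correction dataset and SOTA text correction models; Code & Data for our NAACL 2022 Paper "MuCGEC: a Multi-Reference Multi-Source Evaluation Dataset for Chinese Gr… ☆563 Jun 9, 2023 Updated 2 years ago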
- An implementation of TransE and its extended models for Knowledge Representation Learning on TensorFlow ☆513 Nov 3, 2022 Updated 3 years ago
- [ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization ☆713 Aug 13, 2024 Updated last year
- A full Python Implementation of the ROUGE Metric (not a wrapper) ☆715 Nov 19, 2024 Updated last year
- Four word embedding models implemented in Python. Supporting arbitrary context features ☆848 Aug 22, 2019 Updated 6 years ago
- FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens. ☆1,011 Sep 4, 2024 Updated last year
- Must-read papers on Machine Reading Comprehension ☆891 Jul 9, 2020 Updated 5 years ago
- LongBench v2 and LongBench (ACL 25'&24') ☆1,093 Jan 15, 2025 Updated last year
- A TensorFlow implementation of QANet for machine reading comprehension ☆983 May 30, 2018 Updated 7 years ago
- A collection of open-source datasets to train instruction-following LLMs (ChatGPT, LLaMA, Alpaca) ☆1,143 Jan 4, 2024 Updated 2 years ago
- Bi-directional Attention Flow (BiDAF) network is a multi-stage hierarchical process that represents context at different levels of granul… ☆1,540 May 31, 2023 Updated 2 years ago
- Shared repository for open-sourced projects from the Google AI Language team. ☆1,746 Feb 5, 2026 Updated last week
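- A Python module that extracts provinces, cities, and districts from Simplified Chinese strings and supports mapping, validation, and simple plotting ☆1,776 Mar 19, 2024 Updated last year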
- Official GitHub repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023] ☆1,812 Jul 27, 2025 Updated 6 months ago
- GPT2 for Multiple Languages, including pretrained models. Multilingual GPT2 support, with a 1.5-billion-parameter pretrained Chinese model ☆1,705 May 22, 2023 Updated 2 years ago
- ⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Pl… ☆2,174 Oct 8, 2024 Updated last year
- Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models ☆2,144 Jun 2, 2025 Updated 8 months ago
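- A collection of NLP competition strategy implementations, baselines for various tasks, competition experience posts (current, past, and practice competitions), NLP conference dates, popular media channels, GPU recommendations, and more; continuously updated ☆2,239 Aug 29, 2023 Updated 2 years ago
- A curated list of public Chinese medical NLP resources: terminologies/corpora/word vectors/pretrained models/knowledge graphs/named entity recognition/QA/information extraction/models/papers/etc ☆2,534 Jan 17, 2024 Updated 2 years ago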
- [Medical_NLP ➟ Awesome-AI4Med] medical-related LLMs, Multimodal systems, Datasets, Benchmarks, and more. ☆2,516 Feb 6, 2026 Updated last week