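shibing624 / similarities
Similarities: a toolkit for similarity calculation and semantic search. It supports text-to-text, text-to-image, and image-to-image search over datasets on the scale of hundreds of millions of records, is developed in Python 3, and works out of the box.
☆857 Updated 7 months ago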
Alternatives and similar repositories for similarities
Users who are interested in similarities are comparing it to the libraries listed below.
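- unified embedding model ☆864 Updated last year
- TextGen: implementation of text generation models, including LLaMA, BLOOM, GPT2, BART, T5, SongNet, and so on. Text generation models, with implementations including LLaMA, ChatGLM, BLO… ☆964 Updated 9 months ago
- Chinese CLIP pretrained model ☆415 Updated 2 years ago
- A simple, fast tool for word segmentation and named entity recognition ☆599 Updated 2 months ago
- text2vec, text to vector. A text vector representation tool that converts text into vector matrices; implements text representation and text similarity models including Word2Vec, RankBM25, Sentence-BERT, and CoSENT; works out of the box. ☆4,765 Updated 2 weeks ago
- pke_zh, Python keyphrase extraction for Chinese (zh). A tool for extracting Chinese keywords or key sentences; implements KeyBert, PositionRank, TopicRank, TextRank, and other algorithms; works out of the box. ☆208 Updated last year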
- PyTorch implementation of the PaddleNLP UIE model ☆637 Updated last year
- PromptCLUE, a zero-shot learning model supporting tasks entirely in Chinese ☆663 Updated 2 years ago
- Chinese named entity recognition. Includes the latest Chinese named entity recognition papers, related tools, and datasets, as well as Chinese pretrained models, word vectors, and named entity recognition surveys. ☆711 Updated 3 months ago
- [COLING 2022] CSL: A Large-scale Chinese Scientific Literature Dataset ☆636 Updated 2 years ago
- The online version is temporarily unavailable because we cannot afford the key. You can clone and run it locally. Note: we set default ope… ☆821 Updated last year
- Multi-GPU chatglm using deepspeed and… ☆409 Updated 11 months ago
- Multimodal Chinese LLaMA & Alpaca large language model (VisualCLA) ☆448 Updated last year
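- chatglm 6b finetuning and alpaca finetuning ☆1,544 Updated 3 months ago
- One-click Chinese data augmentation package; NLP data augmentation, BERT data augmentation, EDA: pip install nlpcda ☆1,840 Updated 3 months ago
- DomainWordsDict, a Chinese word dictionary covering more than 68 domains, which can be used for text classification and knowledge enhancement tasks.… ☆709 Updated 3 years ago
- A semantic understanding and matching dataset with 3,000,000+ entries. It can be used for unsupervised contrastive learning, semi-supervised learning, and similar methods to build the best-performing pretrained models for the Chinese domain. ☆298 Updated 2 years ago
- An Open-sourced Knowledgeable Large Language Model Framework. ☆1,326 Updated 5 months ago
- YAYI information extraction large model: instruction-tuned on millions of manually constructed, high-quality information extraction examples, developed by the 中科闻歌 algorithm team. (Repo for YAYI Unified Information Extraction Model) ☆304 Updated 10 months ago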
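- A manually refined Chinese dialogue dataset and a piece of chatglm fine-tuning code ☆1,181 Updated last month
- MuCGEC Chinese error-correction dataset and open-source SOTA text correction models; Code & Data for our NAACL 2022 Paper "MuCGEC: a Multi-Reference Multi-Source Evaluation Dataset for Chinese Gr… ☆548 Updated 2 years ago
- Evaluation of Chinese vector quality based on open-source embedding models ☆143 Updated 2 years ago
- Chinese natural language inference and semantic similarity datasets ☆358 Updated 3 years ago
- Tuning LLMs with no tears💦; Sample Design Engineering (SDE) for more efficient downstream-tuning. ☆1,005 Updated last year
- Uses the peft library to perform efficient 4-bit QLoRA fine-tuning of chatGLM-6B/chatGLM2-6B, plus merging of the LoRA model with the base model and 4-bit quantization. ☆360 Updated last year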
- ⭐️ NLP Algorithms with transformers lib. Supporting Text-Classification, Text-Generation, Information-Extraction, Text-Matching, RLHF, SF… ☆2,350 Updated last year
- 🚀 RocketQA, dense retrieval for information retrieval and question answering, including both Chinese and English state-of-the-art models… ☆778 Updated last year
- An open-source educational chat model from ICALK, East China Normal University. An open-source Chinese-English educational dialogue large model. (General base model, GPU deployment, data cleaning.) Tribute to: LLaMA, MOSS, BELLE, Z… ☆802 Updated last month
- A record of some datasets I have compiled ☆1,050 Updated 3 years ago
- A Chinese NLP preprocessing and parsing package: accurate, efficient, and easy to use. www.jionlp.com ☆3,648 Updated this week