CodingMonkey12 / Semantic-Search-using-Paddle
Semantic search built on Paddle and deployed to production, with multilingual support.
☆12Updated 2 years ago
Alternatives and similar repositories for Semantic-Search-using-Paddle
Users who are interested in Semantic-Search-using-Paddle are comparing it to the libraries listed below
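- General-purpose layout analysis | Chinese document parsing | Document Layout Analysis | layout parser☆46Updated last year
- Scripts for bge inference optimization☆28Updated last year
- Document orientation classification☆219Updated 7 months ago
- A bot that converts text to vectors, built on sentence-transformers☆46Updated 2 years ago
- A tool for time extraction, parsing, and normalization☆53Updated 2 years ago
- Graph QABot Demo | a knowledge-graph question-answering example☆15Updated 2 years ago
- Text correction (CSC, Chinese Spelling Correction / Check; Punct): supports Chinese text correction (spelling, punctuation, and Traditional Chinese correction); CSC covers Chinese text from many domains (including classical Chinese), with a focus on detecting and correcting wrongly written characters (characters and words…☆31Updated 2 weeks ago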
- This repository provides an implementation of "A Simple yet Effective Training-free Prompt-free Approach to Chinese Spelling Correction B…☆71Updated last week
- Based on RapidOCR, extract the PDF content☆174Updated 2 months ago
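- Python3 package for Chinese/English OCR, with paddleocr-v4 onnx model (~14MB). Inference is based on the ppocr-v4-onnx model and delivers accurate, millisecond-level OCR prediction on CPU; Chinese/English OCR in general scenarios reaches open-source SO…☆90Updated 5 months ago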
- Unsupervised tableQA and databaseQA on Chinese finance questions and tabular data☆12Updated 2 years ago
- Deploy a ChatGLM API on Kaggle, called the same way as the ChatGPT API☆14Updated 2 years ago
- Line alignment for PaddleOCR output; OCR line alignment for table-format images☆45Updated 3 years ago
- 🌳CED: Catalog Extraction from Documents☆16Updated last year
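- This project uses layoutlmv3 for training and inference on Chinese images. It mainly addresses three problems: 1. normalizing data into a usable training dataset format 2. modifying the tokenization of layoutlmv3-base-chinese 3. splitting text longer than 512 and applying a sliding window☆52Updated 10 months ago
- A one-stop, automated, open-source annotation platform☆74Updated 2 years ago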
- ☆28Updated 9 months ago
- FinCUGE Instruction dataset☆12Updated 2 years ago
- Python implementation of AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, w…☆46Updated 3 months ago
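- To try textin document parsing, visit https://cc.co/16YSIy☆111Updated 2 weeks ago
- Intelligent text automatic processing tool. AutoText's main functions include text correction, image OCR, layout detection, and table structure recognition. The main functions of this project include …☆25Updated 2 years ago
- In visual information extraction tasks, uses OCR recognition results to constrain the answers of multimodal large models☆37Updated 6 months ago
- Address normalization☆121Updated last year
- FinanceEventGraph, an open event-graph dataset for the financial domain that can be used to build and experiment with event graphs; it includes 3865 acquire (merger and acquisition) events and 9093 invest (investment) events, 12960 events in total☆20Updated last year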
- Let ChatGPT (Large Language Models) Serve As Data Annotator and Zero-shot/few-shot Information Extractor.☆32Updated 2 years ago
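- Chinese datasets for AI training (continuously updated) and an AI company graph. Current datasets: 8000 catering-industry questions, Baidu Zhidao, Alpaca Chinese dataset, computer-domain dataset, Vicuna dataset, RedPajama dataset, Chinese Wikipedia entries dataset, and website/forum Q&A dataset☆58Updated last year
- Company name parser that extracts the brand from a company name. A Chinese company-name segmentation tool that supports extracting place names, brand names (head words), industry terms, and company-name suffixes.☆91Updated 2 years ago
- Converts OCR and other models from paddle to onnx format, then tries loading these onnx models for inference with djl, a Java deep learning framework.☆13Updated 2 years ago
- 🌈 NERpy: Implementation of Named Entity Recognition using Python. A named entity recognition tool that supports models such as BertSoftmax and BertSpan and works out of the box.☆115Updated last year
- PNW, a Chinese new-word discovery algorithm that can identify new words of any length.☆15Updated 2 years ago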