Mythos-Rudy / mnbvc-fasttext-classification
This repo performs text quality classification for mnbvc using fastText
☆15 Updated last year
Alternatives and similar repositories for mnbvc-fasttext-classification:
Users who are interested in mnbvc-fasttext-classification are comparing it to the repositories listed below
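- A survey of large language model training and serving ☆35 Updated last year
- Text deduplication ☆68 Updated 8 months ago
- This project aims to quickly detect and convert the encoding of large numbers of text files, to support data cleaning for the mnbvc corpus project ☆56 Updated 3 months ago
- General-purpose layout analysis | Chinese document parsing | Document Layout Analysis | layout parser ☆45 Updated 7 months ago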
- ☆25 Updated 3 months ago
- This repository provides an implementation of the paper "A Simple yet Effective Training-free Prompt-free Approach to Chinese Spelling Co… ☆52 Updated 3 weeks ago
- Imitate OpenAI with Local Models ☆85 Updated 5 months ago
- ☆137 Updated 6 months ago
- The Level-Navi Agent, a framework that requires no training and utilizes large language models for deep query understanding and precise s… ☆11 Updated last month
- The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF. ☆64 Updated last year
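- This project covers dataset synthesis, model training, and evaluation for the mathematical problem-solving ability of large models, along with notes on related articles. ☆67 Updated 4 months ago
- A native Chinese benchmark for evaluating retrieval-augmented generation ☆107 Updated 9 months ago
- Code implementation of Baichuan's Dynamic NTK-ALiBi: inference on longer texts without fine-tuning ☆47 Updated last year
- Silk Road will be the dataset zoo for Luotuo (骆驼). Luotuo is an open-source Chinese-LLM project founded by 陈启源 @ 华中师范大学 & 李鲁鲁 @ 商汤科技 & 冷子… ☆38 Updated last year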
- CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models ☆38 Updated 10 months ago
- The latest version of llama3, with source code explained line by line in Chinese ☆22 Updated 9 months ago
- Python3 package for Chinese/English OCR, with paddleocr-v4 onnx model (~14MB). Inference based on the ppocr-v4-onnx model, enabling accurate millisecond-level OCR prediction on CPU; Chinese/English OCR in general scenarios reaches open-source SO… ☆55 Updated last week
- China's first legal large language model trained with full-parameter training, HanFei-1.0 (韩非) ☆112 Updated last year
- aigc evals ☆10 Updated last year
- ☆36 Updated 4 months ago
- TianGong-AI-Unstructure ☆56 Updated this week
- ☆62 Updated 4 months ago
- A Chinese instruction dataset for fine-tuning LLMs ☆27 Updated last year
- An open-source multimodal large language model based on baichuan-7b ☆73 Updated last year
- SearchGPT: Building a quick conversation-based search engine with LLMs. ☆44 Updated 3 weeks ago
- ☆45 Updated 7 months ago
- Qwen-WisdomVast is a large model trained on 1 million high-quality Chinese multi-turn SFT data, 200,000 English multi-turn SFT data, and … ☆18 Updated 9 months ago
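- Baidu QA dataset with 1 million entries ☆47 Updated last year
- A demonstration of the remarkable effect of vllm on Chinese large language models ☆31 Updated last year
- Focused on Chinese-language large language models applied to a specific industry or field, to build industry-level, company-level, or domain-level large models. ☆115 Updated 4 months ago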