Mythos-Rudy / mnbvc-fasttext-classification
This repo performs MNBVC text quality classification using fastText.
☆14 Updated 11 months ago
Related projects:
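- Text deduplication ☆65 Updated 3 months ago
- This project aims to perform fast encoding detection and conversion on large numbers of text files, to support data cleaning for the mnbvc corpus project ☆52 Updated 3 weeks ago
- Chinese instruction datasets for fine-tuning LLMs ☆27 Updated last year
- Code implementation of Baichuan Dynamic NTK-ALiBi: inference on longer texts without fine-tuning ☆45 Updated last year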
- The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF. ☆58 Updated last year
- Source code for ACL 2023 paper Decoder Tuning: Efficient Language Understanding as Decoding ☆47 Updated last year
- General layout analysis | Chinese document parsing | Document Layout Analysis | layout parser ☆41 Updated 3 months ago
- Silk Road will be the dataset zoo for Luotuo(骆驼). Luotuo is an open-source Chinese-LLM project founded by 陈启源 @ 华中师范大学 & 李鲁鲁 @ 商汤科技 & 冷子… ☆38 Updated 10 months ago
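- Instruction fine-tuning of large models with a single codebase ☆36 Updated last year
- (NBCE) Naive Bayes-based Context Extension on ChatGLM-6b ☆14 Updated last year
- A native Chinese retrieval-augmented generation evaluation benchmark ☆92 Updated 5 months ago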
- ☆23 Updated this week
- A survey of large language model training and serving ☆32 Updated last year
- ☆124 Updated 2 months ago
- ChatGLM2-6B fine-tuning, SFT/LoRA, instruction fine-tuning ☆107 Updated last year
- Parameter-efficient fine-tuning of ChatGLM-6B based on LoRA and P-Tuning v2 ☆54 Updated last year
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆19 Updated last year
- ☆58 Updated last year
- The newest version of Llama 3, with source code explained line by line in Chinese ☆21 Updated 5 months ago
- Clustering and Ranking: Diversity-preserved Instruction Selection through Expert-aligned Quality Estimation ☆61 Updated 4 months ago
- TianGong-AI-Unstructure ☆48 Updated last week
- How to train an LLM tokenizer ☆123 Updated last year
- NTK scaled version of ALiBi position encoding in Transformer. ☆64 Updated last year
- CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models ☆37 Updated 6 months ago
- Code for Scaling Laws of RoPE-based Extrapolation ☆68 Updated 11 months ago
- LLM+RAG for QA ☆19 Updated 8 months ago
- Fine-tuning Chinese large language models with QLoRA, including ChatGLM, Chinese-LLaMA-Alpaca, and BELLE ☆86 Updated last year
- Code & Data for our Paper "NaSGEC: Multi-Domain Chinese Grammatical Error Correction for Native Speaker Texts" (ACL 2023 Findings) ☆73 Updated last year
- 1.4B sLLM for Chinese and English - HammerLLM🔨 ☆43 Updated 5 months ago
- A demo of the remarkable effect of vLLM on Chinese large language models ☆31 Updated 10 months ago