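Inspired by self-instruct: beyond general-purpose LLMs, small LLMs for vertical domains can also deliver customized results. Questions and answers are obtained from GPT and used as training data.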
☆18 May 12, 2023 Updated 2 years ago
Alternatives and similar repositories for domain-self-instruct
Users that are interested in domain-self-instruct are comparing it to the libraries listed below
- A Structured Grammar for Chart Annotation ☆15 May 8, 2025 Updated 10 months ago
- Accelerating GOT-OCRv2 with VLLM ☆11 Nov 15, 2024 Updated last year
- ☆11 Jan 11, 2022 Updated 4 years ago
- The official implementation of our work SQLFixAgent: Towards Semantic-Accurate Text-to-SQL Parsing via Consistency-Enhanced Multi-Agent C… ☆23 May 2, 2025 Updated 10 months ago
- PyTorch training and prediction framework for Baidu's UIE extraction model ☆12 Nov 20, 2024 Updated last year
- Official code repository for PromptMix: A Class Boundary Augmentation Method for Large Language Model Distillation, EMNLP 2023 ☆12 Dec 13, 2023 Updated 2 years ago
- Distilling Task-Specific Knowledge from BERT into Simple Neural Networks. ☆15 Aug 28, 2020 Updated 5 years ago
- A Chinese self-instruct dataset built with ChatGPT ☆119 May 16, 2023 Updated 2 years ago
- A simple API version of funasr speech-to-text (funasr + fastapi), easy to deploy on a server ☆13 Aug 10, 2024 Updated last year
- Event extraction with BERT: [cls] is used for event classification and the last-layer vectors for sequence labeling, with the two tasks trained jointly. ☆13 Jun 7, 2021 Updated 4 years ago
- Chinese NLP dataset ☆17 Nov 6, 2020 Updated 5 years ago
- Code for KDD 2025 paper "FreRA: A Frequency-Refined Augmentation for Contrastive Learning on Time Series Classification" ☆32 Jun 20, 2025 Updated 8 months ago
- MAG-SQL: Multi-Agent Generative Approach with Soft Schema Linking and Iterative Sub-SQL Refinement for Text-to-SQL ☆18 Jul 10, 2025 Updated 7 months ago
- Dataset generator for ChatGLM fine-tuning, with multi-turn dialogue support ☆15 Jul 22, 2023 Updated 2 years ago
- An easy-to-use library and command-line tool for TTS ☆15 May 3, 2025 Updated 10 months ago
- huggingface ChineseBert Tokenizer ☆16 Apr 16, 2022 Updated 3 years ago
- Agentic RAG for open domain text-to-query ☆16 Aug 28, 2025 Updated 6 months ago
- [ACL 2024 Findings] Learning Fine-Grained Grounded Citations for Attributed Large Language Models ☆20 Oct 24, 2024 Updated last year
- ☆24 Oct 14, 2024 Updated last year
- Code for RECENT ☆13 Dec 18, 2022 Updated 3 years ago
- Reproduction of the paper "Distilling Task-Specific Knowledge from BERT into Simple Neural Networks" ☆16 Jun 13, 2021 Updated 4 years ago
- ☆17 Mar 24, 2023 Updated 2 years ago
- Chinese sentence embedding generation based on SimCSE ☆16 Jun 8, 2022 Updated 3 years ago
- Source code for AAAI 2021 paper "A Supervised Multi-Head Self-Attention Network for Nested Named Entity Recognition" ☆16 Jun 16, 2021 Updated 4 years ago
- Accelerate vector generation by using an ONNX model ☆18 Jan 23, 2024 Updated 2 years ago
- 2021 Sohu Campus Text Matching Algorithm Competition ☆16 Jun 4, 2021 Updated 4 years ago
- ☆23 Apr 22, 2025 Updated 10 months ago
- Generate dialog datasets from documents using LLMs such as ChatGLM2 or ChatGPT ☆164 Oct 25, 2023 Updated 2 years ago
- FastAPI Server Implementation for Bilibili Index TTS ☆25 Apr 13, 2025 Updated 10 months ago
- [ACL 25] SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities ☆28 Apr 2, 2025 Updated 11 months ago
- Question-answer summarization/seq2seq/PGN/Bert_sum/UniLM ☆19 Oct 4, 2020 Updated 5 years ago
- Excellent open source network agent tools ☆21 Nov 20, 2023 Updated 2 years ago
- ☆22 May 22, 2024 Updated last year
- LLM question answering based on an external knowledge base ☆24 Mar 6, 2024 Updated 2 years ago
- [EMNLP 2024] The official GitHub repo for the paper "Course-Correction: Safety Alignment Using Synthetic Preferences" ☆20 Oct 2, 2024 Updated last year
- ☆22 Sep 11, 2025 Updated 5 months ago
- Instruction-tuning data generation using an LLM for a specific scenario. ☆23 May 2, 2024 Updated last year
- TextEmbed is a REST API crafted for high-throughput and low-latency embedding inference. It accommodates a wide variety of embedding mode… ☆28 Sep 5, 2024 Updated last year
- Extractive machine reading comprehension based on PyTorch + BERT ☆21 Dec 8, 2022 Updated 3 years ago