seanzhang-zhichen / baichuan-Dynamic-NTK-ALiBi
Code implementation of Dynamic NTK-ALiBi for Baichuan: inference on longer text without fine-tuning
☆46 Updated last year
Related projects
Alternatives and complementary repositories for baichuan-Dynamic-NTK-ALiBi
- NTK-scaled version of ALiBi position encoding in Transformer. ☆67 Updated last year
- Zero-shot learning evaluation benchmark, Chinese version ☆54 Updated 3 years ago
- Fine-tune the BLOOM large language model with the LoRA method ☆28 Updated last year
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆19 Updated last year
- SuperCLUE-Math6: exploring a new generation of native Chinese multi-turn, multi-step mathematical reasoning datasets ☆46 Updated 9 months ago
- "WuDao" (悟道) data ☆39 Updated 3 years ago
- ☆43 Updated 11 months ago
- MOSS chat fine-tuning ☆50 Updated 7 months ago
- Text deduplication ☆67 Updated 6 months ago
- Chinese instruction tuning datasets ☆118 Updated 7 months ago
- ☆82 Updated last year
- Zero-shot NLU & NLG based on the mengzi-t5-base-mt pretrained model ☆75 Updated 2 years ago
- A more efficient GLM implementation! ☆55 Updated last year
- The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF. ☆61 Updated last year
- Fine-tuning Chinese large language models with QLoRA, including ChatGLM, Chinese-LLaMA-Alpaca, and BELLE ☆85 Updated last year
- ChatGLM2-6B fine-tuning: SFT/LoRA, instruction fine-tuning ☆106 Updated last year
- Naive Bayes-based Context Extension (NBCE) on ChatGLM-6b ☆14 Updated last year
- Code for Scaling Laws of RoPE-based Extrapolation ☆70 Updated last year
- ☆93 Updated 8 months ago
- Instruction fine-tuning of the BLOOM model ☆24 Updated last year
- Source code for ACL 2023 paper Decoder Tuning: Efficient Language Understanding as Decoding ☆48 Updated last year
- OPD: Chinese Open-Domain Pre-trained Dialogue Model ☆74 Updated last year
- chatglm_rlhf_finetuning ☆27 Updated last year
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆69 Updated last year
- Parameter-efficient fine-tuning of ChatGLM-6B based on LoRA and P-Tuning v2 ☆54 Updated last year
- GoGPT: a Chinese-English enhanced large model trained on Llama/Llama 2 | Chinese-Llama2 ☆78 Updated last year
- How to train an LLM tokenizer ☆130 Updated last year
- This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2b model. The project includes both model and train… ☆52 Updated 7 months ago
- deep training task ☆29 Updated last year
- Measuring Massive Multitask Chinese Understanding ☆87 Updated 8 months ago