WalkerMitty / Fast-Llama2
Fast instruction tuning with Llama2
☆11Updated last year
Alternatives and similar repositories for Fast-Llama2
Users who are interested in Fast-Llama2 are comparing it to the libraries listed below.
- 1.4B sLLM for Chinese and English - HammerLLM🔨☆44Updated last year
- This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2b model. The project includes both model and train…☆57Updated last year
- OpenLLMDE: An open source data engineering framework for LLMs☆17Updated last year
- An analysis of the official transformers source code. In the era of large AI models, pytorch and transformer are the new operating system, and everything else is software running on top of them.☆17Updated last year
- Fast LLM training codebase with dynamic strategy selection [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler]☆39Updated last year
- The newest version of llama3, with source code explained line by line in Chinese☆22Updated last year
- Copies the MLP of llama3 8 times as 8 experts, creates a router with random initialization, and adds a load balancing loss to construct an 8x8b Mo…☆27Updated 11 months ago
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper☆32Updated last year
- Qwen-WisdomVast is a large model trained on 1 million high-quality Chinese multi-turn SFT data, 200,000 English multi-turn SFT data, and …☆18Updated last year
- Code implementation of Baichuan Dynamic NTK-ALiBi: inference on longer texts without fine-tuning☆47Updated last year
- Official completion of “Training on the Benchmark Is Not All You Need”.☆34Updated 5 months ago
- A survey of large language model training and serving☆37Updated last year
- An explainer and resource roundup on reproducing Deepseek-r1☆21Updated 3 months ago
- code for Scaling Laws of RoPE-based Extrapolation☆73Updated last year
- ROUGE for multilingual Summarization☆25Updated 3 years ago
- SuperCLUE-Math6: exploring a new generation of native Chinese multi-turn, multi-step mathematical reasoning datasets☆56Updated last year
- A Multi-Format Transfer Learning Model for Event Argument Extraction via Variational Information Bottleneck☆10Updated 2 years ago
- Qwen1.5-SFT (Alibaba, Ali), Qwen_Qwen1.5-2B-Chat/Qwen_Qwen1.5-7B-Chat fine-tuning (transformers)/LORA (peft)/inference☆63Updated last year
- [ICLR'24 spotlight] Tool-Augmented Reward Modeling☆50Updated 3 weeks ago
- ☆53Updated last week
- [ACL 2025] An official PyTorch implementation of the paper: Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement☆30Updated 3 weeks ago
- Pretrain, decay, SFT a CodeLLM from scratch 🧙‍♂️☆36Updated last year
- [ICML2025] The official implementation of "C-3PO: Compact Plug-and-Play Proxy Optimization to Achieve Human-like Retrieval-Augmented Gene…☆23Updated last month
- ☆46Updated this week
- Official Implementation of APB (ACL 2025 main)☆28Updated 4 months ago
- ☆36Updated 9 months ago
- A native Chinese, level-graded benchmark for code ability☆13Updated last year
- Code for preprint "Metadata Conditioning Accelerates Language Model Pre-training (MeCo)"☆39Updated last month
- ☆14Updated last year
- Analytical solutions for logistic regression and single-layer softmax☆12Updated 3 years ago