Llama 2 fine-tuning with DeepSpeed and LoRA
☆176Jul 28, 2023Updated 2 years ago
Alternatives and similar repositories for llama2-lora-fine-tuning
Users who are interested in llama2-lora-fine-tuning are comparing it to the libraries listed below.
- Llama2 Chinese fine-tuning☆38Aug 2, 2023Updated 2 years ago
- The code of SKS☆15Mar 22, 2022Updated 4 years ago
- [Information Systems-2024] The official implementation of ACMR (Bert4XMR).☆11Sep 22, 2024Updated last year
- A simple, easy-to-understand guide to fine-tuning LLaMA.☆413Jul 5, 2023Updated 2 years ago
- Implementation of SATA Tree-LSTM (Dynamic Compositionality in Recursive Neural Networks with Structure-aware Tag Representations, AAAI 20…☆10Jun 21, 2022Updated 3 years ago
- MSTI☆16Mar 6, 2024Updated 2 years ago
- Documentation at☆14Mar 27, 2025Updated last year
- Code for COLING 2022 paper "FactMix: Using a Few Labeled In-domain Examples to Generalize to Cross-domain Named Entity Recognition"☆15Jan 15, 2023Updated 3 years ago
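- Llama Chinese community: a real-time roundup of the latest Llama learning resources, building the best open-source ecosystem for Chinese Llama large models; fully open source and available for commercial use☆14,713Apr 6, 2025Updated last year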
- ☆85May 2, 2026Updated last month
- Train LLMs (bloom, llama, baichuan2-7b, chatglm3-6b) with DeepSpeed pipeline mode. Faster than ZeRO/ZeRO++/FSDP.☆97Feb 5, 2024Updated 2 years ago
- Chinese Word Segmentation task based on BERT and implemented in PyTorch☆14Aug 14, 2020Updated 5 years ago
- Universal information extraction with instruction learning☆398Feb 28, 2025Updated last year
- [CVPR 2026] FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection☆36Jun 7, 2026Updated last week
- Look Back to Reason Forward: Revisitable Memory for Long-Context LLM Agents☆41Apr 13, 2026Updated 2 months ago
- ☆43Dec 15, 2023Updated 2 years ago
- Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)☆165Oct 30, 2024Updated last year
- Llama2-SFT, Llama-2-7B fine-tuning (transformers)/LoRA (peft)/inference☆27Jul 26, 2023Updated 2 years ago
- Code for the paper "A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis"☆20Jun 12, 2025Updated last year
- QuoteSum is a textual QA dataset containing Semi-Extractive Multi-source Question Answering (SEMQA) examples written by humans, based on …☆13Mar 25, 2024Updated 2 years ago
- Source code for the paper "A Medical Semantic-Assisted Transformer for Radiographic Report Generation"☆25Jun 23, 2023Updated 2 years ago
- Large language model fine-tuning for bloom, opt, gpt, gpt2, llama, llama-2, cpmant, and so on☆99Apr 24, 2024Updated 2 years ago
- Checking sentence fluency with a BERT language model☆15Aug 17, 2020Updated 5 years ago
- Official Repository for "Modeling Hierarchical Structures with Continuous Recursive Neural Networks" (ICML 2021)☆12Aug 18, 2021Updated 4 years ago
- RAG based on LangChain☆11Mar 1, 2024Updated 2 years ago
- AI-WordCards is an innovative project that leverages the power of GPT, StableDiffusion, and DALL-E3 to create educational and engaging wo…☆11May 16, 2024Updated 2 years ago
- How to train an LLM tokenizer☆152Jul 13, 2023Updated 2 years ago
- [EMNLP 2023] ALCUNA: Large Language Models Meet New Knowledge☆30Oct 30, 2023Updated 2 years ago
- ☆29Apr 30, 2024Updated 2 years ago
- Repository for "Attribute First, then Generate: Locally-attributable Grounded Text Generation", ACL 2024☆30Dec 19, 2024Updated last year
- Source code of paper "Alirector: Alignment-Enhanced Chinese Grammatical Error Corrector" (Findings of ACL 2024)☆14Mar 19, 2025Updated last year
- Research code for various data-parallel pretraining approaches, based on PyTorch GPT-2.☆11Dec 16, 2022Updated 3 years ago
- a multimodal retrieval dataset☆25Jul 8, 2023Updated 2 years ago
- Code for our EMNLP 2020 paper "Uncertainty-Aware Label Refinement for Sequence Labeling"☆22Oct 4, 2020Updated 5 years ago
- Source code for ACL 2022 paper "Self-contrastive Decorrelation for Sentence Embeddings".☆26Mar 10, 2025Updated last year
- Chinese LLaMA-2 & Alpaca-2 LLMs, phase two of the project, with 64K long-context models☆7,137Apr 19, 2026Updated 2 months ago
- [ICLR 2025] SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction☆90Mar 23, 2025Updated last year
- ☆15Aug 4, 2025Updated 10 months ago
- Handling long-running processes (like ML model predictions) inside a Flask app using Celery.☆12Jan 13, 2021Updated 5 years ago