beichao1314 / Open-Llama
The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF.
☆67Mar 27, 2023Updated 2 years ago
Alternatives and similar repositories for Open-Llama
Users who are interested in Open-Llama are comparing it to the libraries listed below.
- The complete training code of the open-source high-performance Llama model, including the full process from pre-training to RLHF.☆68May 9, 2023Updated 2 years ago
- Karras et al. (2022) diffusion models for PyTorch☆17Oct 5, 2023Updated 2 years ago
- A LLaMA1/LLaMA2 Megatron implementation.☆28Dec 13, 2023Updated 2 years ago
- PLM: Efficient Peripheral Language Models Hardware-Co-Designed for Ubiquitous Computing☆21Mar 18, 2025Updated 10 months ago
- Ongoing research training transformer models at scale☆18Jul 27, 2023Updated 2 years ago
- Best practice for training LLaMA models in Megatron-LM☆664Jan 2, 2024Updated 2 years ago
- 1.4B sLLM for Chinese and English - HammerLLM🔨☆43Apr 7, 2024Updated last year
- ☆55Jan 3, 2025Updated last year
- Unofficial implementation of https://arxiv.org/pdf/2407.14679☆53Sep 7, 2024Updated last year
- example for rendering charts with flask & echarts☆19May 17, 2018Updated 7 years ago
- train llama on a single A100 80G node using 🤗 transformers and 🚀 Deepspeed Pipeline Parallelism☆224Nov 21, 2023Updated 2 years ago
- A Massive Multi-Level Multi-Subject Knowledge Evaluation benchmark☆104Jul 20, 2023Updated 2 years ago
- ☆30May 20, 2022Updated 3 years ago
- Firefly Chinese LLaMA-2 large model, supporting continued pre-training of large models such as Baichuan2, Llama2, Llama, Falcon, Qwen, Baichuan, InternLM, and Bloom☆416Oct 21, 2023Updated 2 years ago
- The Code and Script of "David's Slingshot: A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis"☆35Jun 13, 2025Updated 8 months ago
- how to run DeepSeek-R1-Distill-Qwen-1.5B GGUF locally on your PC☆28Jan 24, 2025Updated last year
- Instruction-tuning tool for large language models (supports FlashAttention)☆177Jan 4, 2024Updated 2 years ago
- [AAAI 2026] The Avengers: A Simple Recipe for Uniting Smaller Language Models to Challenge Proprietary Giants☆46Dec 11, 2025Updated 2 months ago
- ☆46Sep 27, 2025Updated 4 months ago
- [ACL 2025] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLM…☆68Oct 27, 2024Updated last year
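- Y-Agent Studio is an agent development suite for enterprise applications, with Y-Agent as its core module. It includes features that vertical domains need most: agent orchestration, RAG, workflow logs, unit testing, workflow testing, and corpus production. Agent orchestration can support both multi-agent collaboration and hybrid workflow orchestration within a single workflow…☆25Oct 4, 2025Updated 4 months ago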
- Financial Analysis and Algorithmic Trading Strategies in Python☆11Feb 16, 2023Updated 3 years ago
- [ACL 2024] Progressive LLaMA with Block Expansion.☆514May 20, 2024Updated last year
- Chinese Financial Assistant with Large Language Model☆77Sep 4, 2024Updated last year
- Implementation of Chinese ChatGPT☆288Nov 20, 2023Updated 2 years ago
- ☆280Jul 10, 2023Updated 2 years ago
- Text deduplication☆78May 23, 2024Updated last year
- ⭐️ NLP Algorithms with transformers lib. Supporting Text-Classification, Text-Generation, Information-Extraction, Text-Matching, RLHF, SF…☆2,409Sep 29, 2023Updated 2 years ago
- [NeurIPS 2024 Main Track] Code for the paper titled "Instruction Tuning With Loss Over Instructions"☆38May 24, 2024Updated last year
- [CVPR2025] VDocRAG: Retrieval-Augmented Generation over Visually-Rich Documents☆58May 26, 2025Updated 8 months ago
- Chinese instruction tuning datasets☆143Apr 10, 2024Updated last year
- Ongoing research training transformer language models at scale, including: BERT & GPT-2☆1,434Mar 20, 2024Updated last year
- The dataset and source code for our paper: "Did You Ask a Good Question? A Cross-Domain Question Intention Classification Benchmark for Te…☆32Jul 5, 2021Updated 4 years ago
- End-to-end integration of HuggingFace's models for sequence labeling.☆11Oct 4, 2020Updated 5 years ago
- Implementation of the model from "Faster sorting algorithms discovered using deep reinforcement learning" that discovered an all-new ult…☆11Aug 29, 2023Updated 2 years ago
- Chinese large-model fine-tuning (LLM-SFT), math instruction dataset MWP-Instruct, supported models (ChatGLM-6B, LLaMA, Bloom-7B, baichuan-7B), supports (LoRA, QLoRA, DeepSpeed, UI, TensorboardX), supports (…☆216May 17, 2024Updated last year
- Instruction Tuning with GPT-4☆4,340Jun 11, 2023Updated 2 years ago
- ☆147Apr 16, 2024Updated last year
- MEASURING MASSIVE MULTITASK CHINESE UNDERSTANDING☆89Mar 24, 2024Updated last year