A LLaMA1/LLaMA2 Megatron implementation.
☆28Dec 13, 2023Updated 2 years ago
Alternatives and similar repositories for LLaMA-Megatron
Users who are interested in LLaMA-Megatron are comparing it to the libraries listed below.
- ☆84Sep 9, 2023Updated 2 years ago
- ☆52Mar 5, 2025Updated last year
- ☆21Sep 5, 2023Updated 2 years ago
- code for ACL2024-main: BatchEval: Towards Human-like Text Evaluation☆19May 20, 2024Updated last year
- Best practice for training LLaMA models in Megatron-LM☆663Jan 2, 2024Updated 2 years ago
- distributed trainer for LLMs☆589May 20, 2024Updated last year
- A reimplementation of KOSMOS-1 from "Language Is Not All You Need: Aligning Perception with Language Models"☆27Mar 3, 2023Updated 3 years ago
- code for EACL2024-main: Generative Dense Retrieval: Memory Can Be a Burden☆32Jan 19, 2024Updated 2 years ago
- MMLU eval for RU/EN☆16Jul 31, 2023Updated 2 years ago
- rule matcher (context free grammar)☆10Dec 27, 2019Updated 6 years ago
- mobile part of the open SSI framework☆12Sep 5, 2018Updated 7 years ago
- a within-document event coreference resolution system, trained and evaluated on the KBP corpus.☆10May 15, 2023Updated 2 years ago
- Repository for "Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators"☆12Mar 25, 2025Updated last year
- This project contains implementations of several common NLP algorithms: keyword extraction, named entity recognition, automatic summarization, text similarity comparison, and others☆16Jan 16, 2022Updated 4 years ago
- Just prepare config file and start training your metric learning model with ease☆16Apr 2, 2024Updated 2 years ago
- PyTorch implementation of the Reinforced Mnemonic Reader + Answer Verifier model (https://arxiv.org/abs/1808.05759)☆10Nov 23, 2018Updated 7 years ago
- Resources for our ACL 2023 paper: Distilling Script Knowledge from Large Language Models for Constrained Language Planning☆36Aug 19, 2023Updated 2 years ago
- GoldFinch and other hybrid transformer components☆46Jul 20, 2024Updated last year
- ☆14Sep 30, 2021Updated 4 years ago
- [ACM MM25] Official Pytorch implementation of [Decoupled Global-Local Alignment for Improving Compositional Understanding]☆16Jul 15, 2025Updated 9 months ago
- Intelligent Resource Requirement Estimation and Scheduling for Deep Learning Jobs on Distributed GPU Clusters☆15Nov 18, 2021Updated 4 years ago
- ☆14Mar 5, 2023Updated 3 years ago
- ☆12Jun 7, 2019Updated 6 years ago
- Official repository for the paper DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines☆19Dec 8, 2023Updated 2 years ago
- ☆13Dec 29, 2017Updated 8 years ago
- Co-training for Policy Learning☆13Aug 8, 2019Updated 6 years ago
- 🤔 When in Doubt: Improving Classification Performance with Alternating Normalization [Findings of EMNLP2021]☆14Oct 29, 2021Updated 4 years ago
- ☆11Apr 23, 2023Updated 2 years ago
- To pioneer training long-context multi-modal transformer models☆74Aug 8, 2025Updated 8 months ago
- ☆11Nov 21, 2024Updated last year
- ☆41Sep 2, 2021Updated 4 years ago
- Continual Memorization of Factoids in Large Language Models☆12Nov 20, 2024Updated last year
- Chinese large language model evaluation, phase 1☆113Oct 23, 2023Updated 2 years ago
- Pre-built OpenCV for armhf Alpine Linux 3.6☆14Nov 20, 2018Updated 7 years ago
- ☆33Jun 5, 2025Updated 10 months ago
- The OlymMATH dataset☆24Jun 1, 2025Updated 10 months ago
- This repository contains source code for the PASTA model, a pre-trained language model for table-based fact verification.☆18Dec 27, 2022Updated 3 years ago
- ICML 2025 Spotlight, PCEvolve: Private Contrastive Evolution for Synthetic Dataset Generation via Few-Shot Private Data and Generative AP…☆14Jun 27, 2025Updated 9 months ago
- Emotion Model based Face Animation☆14Jul 23, 2023Updated 2 years ago