MoFHeka/LLaMA-Megatron

A LLaMA1/LLaMA2 Megatron implementation.

☆28

Alternatives and similar repositories for LLaMA-Megatron

Users who are interested in LLaMA-Megatron are comparing it to the libraries listed below.

genggui001 / Megatron-DeepSpeed-Llama
☆84 · Sep 9, 2023 · Updated 2 years ago
MARIO-Math-Reasoning / MARIO_EVAL
☆52 · Mar 5, 2025 · Updated last year
JhCircle / Less-is-More
[XLLM@ACL2025] Official Code for "Less is More: Enhancing Structured Multi-Agent Reasoning via Quality-Guided Distillation"
☆22 · Jul 29, 2025 · Updated 11 months ago
ypw0102 / BatchEval
Code for ACL2024-main: BatchEval: Towards Human-like Text Evaluation
☆19 · May 20, 2024 · Updated 2 years ago
alibaba / Megatron-LLaMA
Best practice for training LLaMA models in Megatron-LM
☆664 · Jan 2, 2024 · Updated 2 years ago
epfLLM / Megatron-LLM
Distributed trainer for LLMs
☆589 · May 20, 2024 · Updated 2 years ago
Derydoca / factory-auto-registration
Demonstration of a factory pattern where the types automatically register themselves
☆13 · Mar 13, 2019 · Updated 7 years ago
bjoernpl / KOSMOS_reimplementation
A reimplementation of KOSMOS-1 from "Language Is Not All You Need: Aligning Perception with Language Models"
☆27 · Mar 3, 2023 · Updated 3 years ago
ypw0102 / GDR
Code for EACL2024-main: Generative Dense Retrieval: Memory Can Be a Burden
☆32 · Jan 19, 2024 · Updated 2 years ago
hcmlab / mobileSSI
Mobile part of the open SSI framework
☆12 · Sep 5, 2018 · Updated 7 years ago
xinwuye / LatentChem
☆50 · Jun 4, 2026 · Updated last month
jsksxs360 / event-coref-emnlp2022
A within-document event coreference resolution system, trained and evaluated on the KBP corpus.
☆10 · May 15, 2023 · Updated 3 years ago
bertmaher / tf32_gemm
Example of binding a TF32 CUTLASS GEMM kernel to PyTorch
☆12 · Jun 7, 2024 · Updated 2 years ago
Daphnis-z / nlp-ztools
This project implements several common NLP algorithms: keyword extraction, named entity recognition, automatic summarization, text similarity comparison, and more.
☆16 · Jan 16, 2022 · Updated 4 years ago
teddysum / korean_evaluation
☆10 · Jun 5, 2025 · Updated last year
RocketFlash / easy_metric_learning
Just prepare a config file and start training your metric learning model with ease
☆16 · May 20, 2026 · Updated last month
CLR-Lab / SimKO
SimKO: Simple Pass@K Policy Optimization
☆31 · Oct 24, 2025 · Updated 8 months ago
xiami2019 / CLAIF
[Findings of ACL'2023] Improving Contrastive Learning of Sentence Embeddings from AI Feedback
☆40 · Aug 14, 2023 · Updated 2 years ago
prodeveloper0 / UniquePrintV1
Synthesizing Fingerprint from Pattern Type Analysis Features using cGAN - WITC 2019
☆12 · Apr 19, 2019 · Updated 7 years ago
deepglint / DanQing
The official repo for the DanQing dataset.
☆36 · Mar 25, 2026 · Updated 3 months ago
AI-confused / CGEC-with-Pointer-Generator-Network-Bart
A pointer-generator network based on the BART language model, for Chinese grammatical error correction.
☆16 · Sep 8, 2022 · Updated 3 years ago
siyuyuan / coscript
Resources for our ACL 2023 paper: Distilling Script Knowledge from Large Language Models for Constrained Language Planning
☆36 · Aug 19, 2023 · Updated 2 years ago
recursal / GoldFinch-paper
GoldFinch and other hybrid transformer components
☆46 · Jul 20, 2024 · Updated last year
xiaoxing2001 / DeGLA
[ACM MM25] Official PyTorch implementation of [Decoupled Global-Local Alignment for Improving Compositional Understanding]
☆16 · Jul 15, 2025 · Updated 11 months ago
awslabs / optimizing-multitask-training-through-dynamic-pipelines
Official repository for the paper DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines
☆19 · Dec 8, 2023 · Updated 2 years ago
Oneflow-Inc / oneflow_face
☆12 · Aug 10, 2022 · Updated 3 years ago
so8991 / 2D-to-3D-interpolation-project
I used morph target animation to implement a system that reconstructs a 3D facial mesh from 2D webcam frames
☆14 · Mar 7, 2017 · Updated 9 years ago
KMnP / can
🤔 When in Doubt: Improving Classification Performance with Alternating Normalization [Findings of EMNLP2021]
☆15 · Oct 29, 2021 · Updated 4 years ago
inclusionAI / DR-Venus
☆92 · May 8, 2026 · Updated 2 months ago
anibali / dsnt-pose2d
2D human pose estimation with DSNT
☆15 · Oct 18, 2018 · Updated 7 years ago
yasumasaonoe / ecbd
☆11 · Apr 23, 2023 · Updated 3 years ago
hieudx149 / X-RetroMAE
Code for a RoBERTa version of RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder
☆10 · Mar 16, 2023 · Updated 3 years ago
GeorgeLuImmortal / PaDeLLM_NER
☆11 · Nov 21, 2024 · Updated last year
search-opensource-space / FashionBERT
☆11 · Sep 18, 2020 · Updated 5 years ago
GeorgeLuImmortal / Hierarchical-BERT-Model-with-Limited-Labelled-Data
☆41 · Sep 2, 2021 · Updated 4 years ago
princeton-nlp / continual-factoid-memorization
Continual Memorization of Factoids in Large Language Models
☆12 · Nov 20, 2024 · Updated last year
leimao / Nsight-Compute-Docker-Image
Nsight Compute in Docker
☆13 · Dec 21, 2023 · Updated 2 years ago
llmeval / LLMEval-1
[AAAI 2024] LLMEval Phase I dataset: 17 categories, 453 questions, 2186 annotators for Chinese LLM evaluation
☆114 · May 21, 2026 · Updated last month
janjongboom / alpine-opencv-docker
Pre-built OpenCV for armhf Alpine Linux 3.6
☆14 · Nov 20, 2018 · Updated 7 years ago