dwzq-com-cn/DongwuLLM


This is the codebase for pre-training, compressing, extending, and distilling LLMs with Megatron-LM.

☆12

Alternatives and similar repositories for DongwuLLM

Users who are interested in DongwuLLM are comparing it to the repositories listed below.


OpenNLG / OpenBA-v2
OpenBA-V2: 3B LLM (Large Language Model) with T5 architecture, utilizing model pruning technique and continuing pretraining from OpenBA-1…
☆25 · May 10, 2024 · Updated 2 years ago
jordddan / Pruning-LLMs
The framework to prune LLMs to any size and any config.
☆94 · Mar 1, 2024 · Updated 2 years ago
Jikai0Wang / OPT-Tree
☆30 · May 24, 2025 · Updated last year
ZetangForward / CMD-Context-aware-Model-self-Detoxification
CMD: a framework for Context-aware Model self-Detoxification (EMNLP 2024 Long Paper)
☆17 · Feb 10, 2025 · Updated last year
dropreg / efficient_alpaca
The aim of this repository is to utilize LLaMA to reproduce and enhance the Stanford Alpaca
☆98 · Apr 5, 2023 · Updated 3 years ago
yyDing1 / GNER
[ACL 2024 Findings] Code implementation of Paper "Rethinking Negative Instances for Generative Named Entity Recognition"
☆60 · Mar 20, 2024 · Updated 2 years ago
noise-learning / SelfMix
☆36 · Oct 14, 2022 · Updated 3 years ago
LCM-Lab / L-CITEEVAL
Evaluating the faithfulness of long-context language models
☆30 · Oct 21, 2024 · Updated last year
LCM-Lab / Bridge_Gap_Diffusion
Diffusion Model Improvement Method
☆35 · Sep 4, 2023 · Updated 2 years ago
LCM-Lab / LOGO
Code for paper: Long cOntext aliGnment via efficient preference Optimization
☆26 · Oct 10, 2025 · Updated 9 months ago
lijuntaopku / UFD
Code for Unsupervised Domain Adaptation of a Pretrained Cross-Lingual Language Model, IJCAI 2020
☆12 · Nov 26, 2020 · Updated 5 years ago
zhengzx-nlp / REDER
[NeurIPS 2021] Duplex Sequence-to-Sequence Learning for Reversible Machine Translation
☆15 · Jun 7, 2022 · Updated 4 years ago
jordddan / GameEval
Using conversational games to evaluate powerful LLMs
☆18 · Sep 3, 2023 · Updated 2 years ago
zhaochen0110 / LMLM
Code and data for "Improving Temporal Generalization of Pre-trained Language Models with Lexical Semantic Change" (EMNLP 2022)
☆17 · Dec 8, 2022 · Updated 3 years ago
qtli / Papers-on-Dialogue-System
A Survey of Neural Dialogue Systems
☆19 · Dec 31, 2021 · Updated 4 years ago
Leey21 / CipherBank
☆14 · Jun 13, 2025 · Updated last year
zhaochen0110 / Timo
Code and data for "Timo: Towards Better Temporal Reasoning for Language Models" (COLM 2024)
☆26 · Oct 23, 2024 · Updated last year
arnab-api / romba
Applies ROME and MEMIT on Mamba-S4 models
☆16 · Apr 5, 2024 · Updated 2 years ago
matthewrennie / go-llama.cpp
Go bindings for LLama.cpp
☆14 · Apr 11, 2023 · Updated 3 years ago
Unkn0wnH4ck3r / pubg_sdk
pubg_sdk
☆11 · Jul 26, 2020 · Updated 6 years ago
Bui1dMySea / MemLong
☆96 · Dec 6, 2024 · Updated last year
zhaochen0110 / Cotempqa
Code and data for "Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?" (ACL 2024)
☆31 · Jul 3, 2024 · Updated 2 years ago
z2z63 / VChat
VChat - a WeChat personal-account interface, completely rebuilt from itchat-uos
☆48 · Jun 8, 2025 · Updated last year
baoy-nlp / Latent-GLAT
Implementation of latent-GLAT (ACL 2022)
☆34 · Apr 30, 2022 · Updated 4 years ago
wjcsharp / hf-2011
Automatically exported from code.google.com/p/hf-2011
☆15 · Feb 12, 2016 · Updated 10 years ago
smutahoang / ttm
Topic models for microblogging content
☆10 · Sep 23, 2015 · Updated 10 years ago
VITA-Group / Random-MoE-as-Dropout
[ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal…
☆56 · Feb 28, 2023 · Updated 3 years ago
iridia-ulb / references
Repository of shared bibtex files (references)
☆11 · Jul 17, 2026 · Updated last week
zdhxiong / mdclub-sdk-js
JavaScript SDK for MDClub
☆12 · May 29, 2022 · Updated 4 years ago
ruotianluo / refexp-comprehension
Referring expression comprehension on ReferIt (RefClef)
☆10 · Nov 28, 2016 · Updated 9 years ago
6vision / sensitive_words
Common sensitive words, collected and compiled from the web!
☆13 · Jan 14, 2024 · Updated 2 years ago
sjcfr / ege-RoBERTa
☆13 · Sep 5, 2021 · Updated 4 years ago
ganeshjawahar / dl4nlp-made-easy
Toy code to kick-start deep learning for NLP!
☆12 · Aug 22, 2016 · Updated 9 years ago
prosecurity / DeepSearch
DeepSearch - Advanced Web Dir Scanner
☆18 · Nov 13, 2018 · Updated 7 years ago
treepanel / treepanel
AGPL licensed Octotree fork
☆13 · Dec 22, 2020 · Updated 5 years ago
TPLink32 / logistic-regression
A summary of the various uses of logistic regression, including linear, multi-class, parallel, distributed, online, and optimization approaches
☆15 · Jul 14, 2017 · Updated 9 years ago
snapwire-media / arion
Fast thumbnail creation and image metadata extraction
☆19 · Mar 14, 2019 · Updated 7 years ago
ProjectNUWA / LayoutNUWA
☆152 · Jan 31, 2024 · Updated 2 years ago
dangxingyu / rnn-icrag
Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"
☆27 · Apr 17, 2024 · Updated 2 years ago