This is the codebase for pre-training, compressing, extending, and distilling LLMs with Megatron-LM.
☆12Mar 11, 2024Updated 2 years ago
Alternatives and similar repositories for DongwuLLM
Users that are interested in DongwuLLM are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- OpenBA-V2: 3B LLM (Large Language Model) with T5 architecture, utilizing model pruning technique and continuing pretraining from OpenBA-1…☆25May 10, 2024Updated last year
- The framework to prune LLMs to any size and any config.☆95Mar 1, 2024Updated 2 years ago
- CMD: a framework for Context-aware Model self-Detoxification (EMNLP2024 Long Paper)☆17Feb 10, 2025Updated last year
- Code for paper: Long cOntext aliGnment via efficient preference Optimization☆24Oct 10, 2025Updated 5 months ago
- ☆28May 24, 2025Updated 10 months ago
- Wordpress hosting with auto-scaling on Cloudways • AdFully Managed hosting built for WordPress-powered businesses that need reliable, auto-scalable hosting. Cloudways SafeUpdates now available.
- The aim of this repository is to utilize LLaMA to reproduce and enhance the Stanford Alpaca☆98Apr 5, 2023Updated 2 years ago
- [ACL 2024 Findings] Code implementation of Paper "Rethinking Negative Instances for Generative Named Entity Recognition"☆60Mar 20, 2024Updated 2 years ago
- Evaluating the faithfulness of long-context language models☆30Oct 21, 2024Updated last year
- ☆36Oct 14, 2022Updated 3 years ago
- Diffusion Model Improvement Method☆35Sep 4, 2023Updated 2 years ago
- Code for Unsupervised Domain Adaptation of a Pretrained Cross-Lingual Language Model, IJCAI 2020☆12Nov 26, 2020Updated 5 years ago
- [NeurIPS 2021] Duplex Sequence-to-Sequence Learning for Reversible Machine Translation☆15Jun 7, 2022Updated 3 years ago
- Using conversational games to evaluate powerful LLMs☆18Sep 3, 2023Updated 2 years ago
- Code and data for "Improving Temporal Generalization of Pre-trained Language Models with Lexical Semantic Change" (EMNLP2022)☆18Dec 8, 2022Updated 3 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- A Survey of Neural Dialogue Systems☆19Dec 31, 2021Updated 4 years ago
- Applies ROME and MEMIT on Mamba-S4 models☆14Apr 5, 2024Updated last year
- Code and data for "Timo: Towards Better Temporal Reasoning for Language Models" (COLM 2024)☆25Oct 23, 2024Updated last year
- Long Context Research☆31Jan 26, 2026Updated 2 months ago
- Go bindings for LLama.cpp☆14Apr 11, 2023Updated 2 years ago
- A static website for a Chatbot with Azure OpenAI, Azure Text to Speech Services and Live2D☆13Sep 4, 2024Updated last year
- pubg_sdk☆11Jul 26, 2020Updated 5 years ago
- ☆96Dec 6, 2024Updated last year
- VChat - 基于itchat-uos完全重构的微信个人号接口☆48Jun 8, 2025Updated 9 months ago
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- Code and data for "Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?" (ACL 2024)☆32Jul 3, 2024Updated last year
- Automatically exported from code.google.com/p/hf-2011☆15Feb 12, 2016Updated 10 years ago
- Implementation of latent-GLAT (ACL-2022)☆34Apr 30, 2022Updated 3 years ago
- ☆11Dec 19, 2023Updated 2 years ago
- Topic models for microblogging content☆10Sep 23, 2015Updated 10 years ago
- MDClub 的 JavaScript 版 SDK☆12May 29, 2022Updated 3 years ago
- A comprehensive and efficient long-context model evaluation framework☆31Feb 25, 2026Updated last month
- Repository of shared bibtex files (references)☆11Mar 9, 2026Updated 2 weeks ago
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal…☆56Feb 28, 2023Updated 3 years ago
- NordVPN Special Discount Offer • AdSave on top-rated NordVPN 1 or 2-year plans with secure browsing, privacy protection, and support for for all major platforms.
- Referring expression comprehension on ReferIt(RefClef)☆10Nov 28, 2016Updated 9 years ago
- DeepSearch - Advanced Web Dir Scanner☆14Nov 13, 2018Updated 7 years ago
- 收集整理于网络,常见敏感词!☆13Jan 14, 2024Updated 2 years ago
- 简单的 AIGC 微服务,可通过 HTTP、gRPC 连接,支持流式回 答。☆10Mar 23, 2023Updated 3 years ago
- smart chinese LLm☆19Jan 31, 2024Updated 2 years ago
- ☆13Sep 5, 2021Updated 4 years ago
- 今日头条搜索引擎以及新闻详情页爬虫(Selenium)☆15Mar 13, 2025Updated last year