This is the codebase for pre-training, compressing, extending, and distilling LLMs with Megatron-LM.
☆12 · Mar 11, 2024 · Updated 2 years ago
Alternatives and similar repositories for DongwuLLM
Users that are interested in DongwuLLM are comparing it to the libraries listed below.
- OpenBA-V2: 3B LLM (Large Language Model) with T5 architecture, utilizing model pruning technique and continuing pretraining from OpenBA-1… ☆25 · May 10, 2024 · Updated 2 years ago
- The framework to prune LLMs to any size and any config. ☆95 · Mar 1, 2024 · Updated 2 years ago
- CMD: a framework for Context-aware Model self-Detoxification (EMNLP2024 Long Paper) ☆17 · Feb 10, 2025 · Updated last year
- Code for paper: Long cOntext aliGnment via efficient preference Optimization ☆24 · Oct 10, 2025 · Updated 7 months ago
- ☆30 · May 24, 2025 · Updated last year
- The aim of this repository is to utilize LLaMA to reproduce and enhance the Stanford Alpaca ☆98 · Apr 5, 2023 · Updated 3 years ago
- [ACL 2024 Findings] Code implementation of Paper "Rethinking Negative Instances for Generative Named Entity Recognition" ☆60 · Mar 20, 2024 · Updated 2 years ago
- Evaluating the faithfulness of long-context language models ☆30 · Oct 21, 2024 · Updated last year
- ☆36 · Oct 14, 2022 · Updated 3 years ago
- Diffusion Model Improvement Method ☆35 · Sep 4, 2023 · Updated 2 years ago
- Code for Unsupervised Domain Adaptation of a Pretrained Cross-Lingual Language Model, IJCAI 2020 ☆12 · Nov 26, 2020 · Updated 5 years ago
- [NeurIPS 2021] Duplex Sequence-to-Sequence Learning for Reversible Machine Translation ☆15 · Jun 7, 2022 · Updated 3 years ago
- Using conversational games to evaluate powerful LLMs ☆18 · Sep 3, 2023 · Updated 2 years ago
- ☆12 · Jun 13, 2025 · Updated 11 months ago
- Code and data for "Improving Temporal Generalization of Pre-trained Language Models with Lexical Semantic Change" (EMNLP2022) ☆18 · Dec 8, 2022 · Updated 3 years ago
- A Survey of Neural Dialogue Systems ☆19 · Dec 31, 2021 · Updated 4 years ago
- Applies ROME and MEMIT on Mamba-S4 models ☆15 · Apr 5, 2024 · Updated 2 years ago
- Code and data for "Timo: Towards Better Temporal Reasoning for Language Models" (COLM 2024) ☆26 · Oct 23, 2024 · Updated last year
- Long Context Research ☆32 · Jan 26, 2026 · Updated 4 months ago
- A static website for a Chatbot with Azure OpenAI, Azure Text to Speech Services and Live2D ☆13 · Sep 4, 2024 · Updated last year
- Go bindings for LLama.cpp ☆14 · Apr 11, 2023 · Updated 3 years ago
- pubg_sdk ☆11 · Jul 26, 2020 · Updated 5 years ago
- ☆97 · Dec 6, 2024 · Updated last year
- VChat - a WeChat personal-account interface, completely rewritten based on itchat-uos ☆48 · Jun 8, 2025 · Updated 11 months ago
- Code and data for "Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?" (ACL 2024) ☆32 · Jul 3, 2024 · Updated last year
- Automatically exported from code.google.com/p/hf-2011 ☆15 · Feb 12, 2016 · Updated 10 years ago
- Implementation of latent-GLAT (ACL-2022) ☆34 · Apr 30, 2022 · Updated 4 years ago
- ☆11 · Dec 19, 2023 · Updated 2 years ago
- Topic models for microblogging content ☆10 · Sep 23, 2015 · Updated 10 years ago
- JavaScript SDK for MDClub ☆12 · May 29, 2022 · Updated 4 years ago
- A comprehensive and efficient long-context model evaluation framework ☆31 · Feb 25, 2026 · Updated 3 months ago
- Repository of shared bibtex files (references) ☆11 · Apr 29, 2026 · Updated last month
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal… ☆56 · Feb 28, 2023 · Updated 3 years ago
- Referring expression comprehension on ReferIt(RefClef) ☆10 · Nov 28, 2016 · Updated 9 years ago
- Common sensitive words, collected and compiled from the web! ☆13 · Jan 14, 2024 · Updated 2 years ago
- Smart Chinese LLM ☆19 · Jan 31, 2024 · Updated 2 years ago
- ☆13 · Sep 5, 2021 · Updated 4 years ago
- Crawler for the Toutiao search engine and news detail pages (Selenium) ☆15 · Mar 13, 2025 · Updated last year
- DeepSearch - Advanced Web Dir Scanner ☆15 · Nov 13, 2018 · Updated 7 years ago