This is the codebase for pre-training, compressing, extending, and distilling LLMs with Megatron-LM.
☆12Mar 11, 2024Updated 2 years ago
Alternatives and similar repositories for DongwuLLM
Users that are interested in DongwuLLM are comparing it to the libraries listed below.
- OpenBA-V2: 3B LLM (Large Language Model) with T5 architecture, utilizing model pruning technique and continuing pretraining from OpenBA-1…☆25May 10, 2024Updated last year
- The framework to prune LLMs to any size and any config.☆96Mar 1, 2024Updated 2 years ago
- CMD: a framework for Context-aware Model self-Detoxification (EMNLP2024 Long Paper)☆17Feb 10, 2025Updated last year
- Code for paper: Long cOntext aliGnment via efficient preference Optimization☆24Oct 10, 2025Updated 6 months ago
- ☆28May 24, 2025Updated 10 months ago
- The aim of this repository is to utilize LLaMA to reproduce and enhance the Stanford Alpaca☆98Apr 5, 2023Updated 3 years ago
- [ACL 2024 Findings] Code implementation of Paper "Rethinking Negative Instances for Generative Named Entity Recognition"☆60Mar 20, 2024Updated 2 years ago
- Evaluating the faithfulness of long-context language models☆30Oct 21, 2024Updated last year
- ☆36Oct 14, 2022Updated 3 years ago
- Diffusion Model Improvement Method☆35Sep 4, 2023Updated 2 years ago
- Code for Unsupervised Domain Adaptation of a Pretrained Cross-Lingual Language Model, IJCAI 2020☆12Nov 26, 2020Updated 5 years ago
- [NeurIPS 2021] Duplex Sequence-to-Sequence Learning for Reversible Machine Translation☆15Jun 7, 2022Updated 3 years ago
- Using conversational games to evaluate powerful LLMs☆18Sep 3, 2023Updated 2 years ago
- Code and data for "Improving Temporal Generalization of Pre-trained Language Models with Lexical Semantic Change" (EMNLP2022)☆18Dec 8, 2022Updated 3 years ago
- A Survey of Neural Dialogue Systems☆19Dec 31, 2021Updated 4 years ago
- Applies ROME and MEMIT on Mamba-S4 models☆14Apr 5, 2024Updated 2 years ago
- Code and data for "Timo: Towards Better Temporal Reasoning for Language Models" (COLM 2024)☆25Oct 23, 2024Updated last year
- Long Context Research☆31Jan 26, 2026Updated 2 months ago
- Go bindings for LLama.cpp☆14Apr 11, 2023Updated 3 years ago
- A static website for a Chatbot with Azure OpenAI, Azure Text to Speech Services and Live2D☆13Sep 4, 2024Updated last year
- pubg_sdk☆11Jul 26, 2020Updated 5 years ago
- ☆96Dec 6, 2024Updated last year
- VChat - A WeChat personal account interface, completely rebuilt on itchat-uos☆48Jun 8, 2025Updated 10 months ago
- Code and data for "Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?" (ACL 2024)☆32Jul 3, 2024Updated last year
- Automatically exported from code.google.com/p/hf-2011☆15Feb 12, 2016Updated 10 years ago
- Implementation of latent-GLAT (ACL-2022)☆34Apr 30, 2022Updated 3 years ago
- ☆11Dec 19, 2023Updated 2 years ago
- Topic models for microblogging content☆10Sep 23, 2015Updated 10 years ago
- JavaScript SDK for MDClub☆12May 29, 2022Updated 3 years ago
- A comprehensive and efficient long-context model evaluation framework☆31Feb 25, 2026Updated last month
- Repository of shared bibtex files (references)☆11Apr 12, 2026Updated last week
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal…☆56Feb 28, 2023Updated 3 years ago
- Referring expression comprehension on ReferIt(RefClef)☆10Nov 28, 2016Updated 9 years ago
- Common sensitive words, collected and compiled from the web☆13Jan 14, 2024Updated 2 years ago
- A simple AIGC microservice, accessible over HTTP or gRPC, with support for streaming responses.☆10Mar 23, 2023Updated 3 years ago
- Smart Chinese LLM☆19Jan 31, 2024Updated 2 years ago
- ☆13Sep 5, 2021Updated 4 years ago
- DeepSearch - Advanced Web Dir Scanner☆15Nov 13, 2018Updated 7 years ago
- Crawler for the Toutiao search engine and news detail pages (Selenium)☆15Mar 13, 2025Updated last year