Pretrain CPM-1
☆53 Apr 20, 2021 Updated 4 years ago
Alternatives and similar repositories for CPM-1-Pretrain
Users that are interested in CPM-1-Pretrain are comparing it to the libraries listed below.
- A plug-in of Microsoft DeepSpeed to fix the bug of DeepSpeed pipeline ☆25 Apr 16, 2021 Updated 4 years ago
- Distill CPM-1 ☆18 May 6, 2021 Updated 4 years ago
- Finetune CPM-1 ☆74 Mar 18, 2023 Updated 3 years ago
- Introduction to CPM ☆165 Sep 26, 2021 Updated 4 years ago
- benchmarks for evaluating MT models ☆11 Jun 26, 2024 Updated last year
- Code for CPM-2 Pre-Train ☆157 Mar 18, 2023 Updated 3 years ago
- Finetune CPM-2 ☆82 Mar 18, 2023 Updated 3 years ago
- This repository mainly records study notes on top-conference papers relevant to NLP algorithm engineers [Text Matching] ☆13 Jul 9, 2022 Updated 3 years ago
- ☆37 Jan 5, 2021 Updated 5 years ago
- A concise implementation of SimCSE ☆16 Aug 2, 2021 Updated 4 years ago
- EVA: Large-scale Pre-trained Chit-Chat Models ☆305 Mar 11, 2023 Updated 3 years ago
- The official implementation of the paper "Self-Updatable Large Language Models by Integrating Context into Model Parameters" ☆15 May 18, 2025 Updated 10 months ago
- [ACL-IJCNLP 2021] "EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets" by Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, … ☆18 Dec 30, 2021 Updated 4 years ago
- Chinese Pre-Trained Language Models (CPM-LM) Version-I ☆1,580 Mar 18, 2023 Updated 3 years ago
- Stochastic Gradient Markov Chain Monte Carlo and Optimisation ☆17 Mar 21, 2017 Updated 9 years ago
- Code for our SIGIR'2017 paper "Neural Rating Regression with Abstractive Tips Generation for Recommendation" ☆14 Jul 24, 2020 Updated 5 years ago
- [ACL'21] Dialogue Response Selection with Hierarchical Curriculum Learning ☆21 Nov 15, 2022 Updated 3 years ago
- An implementation of Maximum Entropy model ☆14 Apr 28, 2012 Updated 13 years ago
- Automatically exported from code.google.com/p/cx-extractor ☆14 Mar 8, 2016 Updated 10 years ago
- karthikbmk's independent study ☆10 Sep 2, 2017 Updated 8 years ago
- Code of the COLING22 paper "uChecker: Masked Pretrained Language Models as Unsupervised Chinese Spelling Checkers" ☆19 Aug 17, 2022 Updated 3 years ago
- NTK scaled version of ALiBi position encoding in Transformer. ☆69 Aug 16, 2023 Updated 2 years ago
- ☆17 Jul 5, 2022 Updated 3 years ago
- [EMNLP 2023] Once Upon a *Time* in *Graph*: Relative-Time Pretraining for Complex Temporal Reasoning ☆17 Oct 31, 2023 Updated 2 years ago
- Example of distributed learning in Julia ☆22 Jun 28, 2017 Updated 8 years ago
- Efficient Inference for Big Models ☆586 Jan 24, 2023 Updated 3 years ago
- ☆15 Nov 19, 2021 Updated 4 years ago
- Ongoing research training transformer language models at scale, including: BERT & GPT-2 ☆69 Jul 20, 2023 Updated 2 years ago
- The codebase for "Group-wise Contrastive Learning for Neural Dialogue Generation" (Cai et al., Findings of EMNLP 2020) ☆55 Feb 24, 2021 Updated 5 years ago
- The dataset and PyTorch Implementation for ACL 2020 paper "MATINF: A Jointly Labeled Large-Scale Dataset for Classification, Question Ans… ☆43 Sep 7, 2020 Updated 5 years ago
- RACE is a multi-dimensional benchmark for code generation that focuses on Readability, mAintainability, Correctness, and Efficiency. ☆12 Oct 12, 2024 Updated last year
- An algorithm that intelligently executes a crypto order over time via Coinbase ☆13 Oct 26, 2021 Updated 4 years ago
- A Transformer-based single-model, multi-scale VAE model ☆58 Jun 29, 2021 Updated 4 years ago
- Repository for ACL2021 paper: <Zero-shot Event Extraction via Transfer Learning: Challenges and Insights>. ☆30 Jan 5, 2023 Updated 3 years ago
- Conversational Toolkit. An Open-Source Toolkit for Fast Development and Fair Evaluation of Text Generation ☆129 Aug 31, 2020 Updated 5 years ago
- a baseline to practice ☆45 Jul 6, 2021 Updated 4 years ago
- ☆38 Aug 5, 2024 Updated last year
- Repository for DISRPT2019 shared task ☆12 Sep 5, 2022 Updated 3 years ago
- LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training ☆406 Jul 31, 2025 Updated 7 months ago