Pretrain CPM-1
☆53Apr 20, 2021Updated 5 years ago
Alternatives and similar repositories for CPM-1-Pretrain
Users that are interested in CPM-1-Pretrain are comparing it to the libraries listed below.
- A plug-in of Microsoft DeepSpeed to fix the bug of DeepSpeed pipeline☆25Apr 16, 2021Updated 5 years ago
- Finetune CPM-1☆73Mar 18, 2023Updated 3 years ago
- Introduction to CPM☆164Sep 26, 2021Updated 4 years ago
- Code for CPM-2 Pre-Train☆157Mar 18, 2023Updated 3 years ago
- Finetune CPM-2☆80Mar 18, 2023Updated 3 years ago
- ☆37Jan 5, 2021Updated 5 years ago
- China Law Research Cup (法研杯) CAIL 2019☆13Jun 17, 2019Updated 7 years ago
- A concise implementation of SimCSE☆16Aug 2, 2021Updated 4 years ago
- EVA: Large-scale Pre-trained Chit-Chat Models☆304Mar 11, 2023Updated 3 years ago
- The baseline method for CCIR 22 https://www.datafountain.cn/competitions/573☆13Aug 2, 2022Updated 3 years ago
- Inference framework for MoE layers based on TensorRT with Python binding☆41May 31, 2021Updated 5 years ago
- [ACL-IJCNLP 2021] "EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets" by Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, …☆18Dec 30, 2021Updated 4 years ago
- Chinese Pre-Trained Language Models (CPM-LM) Version-I☆1,580Mar 18, 2023Updated 3 years ago
- [ACL'21] Dialogue Response Selection with Hierarchical Curriculum Learning☆21Nov 15, 2022Updated 3 years ago
- An implementation of Maximum Entropy model☆14Apr 28, 2012Updated 14 years ago
- Automatically exported from code.google.com/p/cx-extractor☆14Mar 8, 2016Updated 10 years ago
- karthikbmk's independent study☆10Sep 2, 2017Updated 8 years ago
- ☆54Apr 15, 2022Updated 4 years ago
- Dynamic Entity Summarization (DynES)☆20May 10, 2019Updated 7 years ago
- NTK scaled version of ALiBi position encoding in Transformer.☆69Aug 16, 2023Updated 2 years ago
- Code of the COLING22 paper "uChecker: Masked Pretrained Language Models as Unsupervised Chinese Spelling Checkers"☆19Aug 17, 2022Updated 3 years ago
- ☆15Dec 10, 2021Updated 4 years ago
- TREC Real-Time Summarization Tools☆15Jul 19, 2017Updated 8 years ago
- A more efficient GLM implementation!☆54Feb 18, 2023Updated 3 years ago
- ☆52Jan 1, 2024Updated 2 years ago
- [EMNLP 2023] Once Upon a *Time* in *Graph*: Relative-Time Pretraining for Complex Temporal Reasoning☆17Oct 31, 2023Updated 2 years ago
- Example of distributed learning in Julia☆21Jun 28, 2017Updated 8 years ago
- ☆16Nov 19, 2021Updated 4 years ago
- Ongoing research training transformer language models at scale, including: BERT & GPT-2☆69Jul 20, 2023Updated 2 years ago
- The dataset and PyTorch Implementation for ACL 2020 paper "MATINF: A Jointly Labeled Large-Scale Dataset for Classification, Question Ans…☆43Sep 7, 2020Updated 5 years ago
- The codebase for "Group-wise Contrastive Learning for Neural Dialogue Generation" (Cai et al., Findings of EMNLP 2020)☆55Feb 24, 2021Updated 5 years ago
- An algorithm that intelligently executes a crypto order over time via Coinbase☆13Oct 26, 2021Updated 4 years ago
- RACE is a multi-dimensional benchmark for code generation that focuses on Readability, mAintainability, Correctness, and Efficiency.☆14Oct 12, 2024Updated last year
- A Transformer-based, single-model, multi-scale VAE☆57Jun 29, 2021Updated 4 years ago
- Repository for ACL 2021 paper: "Zero-shot Event Extraction via Transfer Learning: Challenges and Insights".☆30Jan 5, 2023Updated 3 years ago
- Conversational Toolkit. An Open-Source Toolkit for Fast Development and Fair Evaluation of Text Generation☆129Aug 31, 2020Updated 5 years ago
- a baseline to practice☆45Jul 6, 2021Updated 4 years ago
- Source code for "Domain-Aware Dialogue State Tracker for Multi-Domain Dialogue Systems"☆10Oct 5, 2020Updated 5 years ago
- ☆13Aug 23, 2017Updated 8 years ago