Scripts to train a bidirectional LSTM with knowledge distillation from BERT
☆159 · Nov 21, 2019 · Updated 6 years ago
Alternatives and similar repositories for distil-bilstm
Users that are interested in distil-bilstm are comparing it to the libraries listed below.
- Distilling BERT using natural language generation. ☆39 · Aug 13, 2023 · Updated 2 years ago
- BERT distillation (distillation experiments based on BERT) ☆317 · Jul 30, 2020 · Updated 5 years ago
- Reference implementation of the paper "Word Embeddings for Entity-annotated Texts" ☆18 · Apr 12, 2019 · Updated 7 years ago
- ☆61 · Nov 14, 2019 · Updated 6 years ago
- Implementation for NATv2. ☆23 · Feb 20, 2021 · Updated 5 years ago
- Distilling Task-Specific Knowledge from BERT into Simple Neural Networks. ☆15 · Aug 28, 2020 · Updated 5 years ago
- Knowledge Distillation For Transformer Language Models ☆54 · Jan 3, 2024 · Updated 2 years ago
- some tutorials for blog: simonjisu.github.io ☆23 · Mar 25, 2021 · Updated 5 years ago
- ☆34 · Jul 16, 2019 · Updated 6 years ago
- ☆65 · Apr 8, 2020 · Updated 6 years ago
- Source code repo for paper "TLDR: Token Loss Dynamic Reweighting for Reducing Repetitive Utterance Generation" ☆10 · Aug 11, 2023 · Updated 2 years ago
- KoParadigm: Korean Inflectional Paradigm Generator ☆59 · Nov 23, 2022 · Updated 3 years ago
- "Learning Discrete and Continuous Factors of Data via Alternating Disentanglement" accepted at ICML2019 ☆22 · Aug 22, 2019 · Updated 6 years ago
- ☆11 · Aug 12, 2020 · Updated 5 years ago
- A transformer model that should be able to solve a simple NER task ☆11 · Mar 7, 2019 · Updated 7 years ago
- Preprocessing Library for Natural Language Processing ☆164 · Dec 6, 2022 · Updated 3 years ago
- Examples of cleaning up raw voices ☆18 · Mar 2, 2022 · Updated 4 years ago
- ☆14 · May 15, 2020 · Updated 6 years ago
- Minimal code to train ELMo models in recent versions of TensorFlow ☆14 · Jun 16, 2026 · Updated last week
- Google's TPGST reimplementation. ☆34 · Dec 11, 2019 · Updated 6 years ago
- ☆12 · Nov 25, 2018 · Updated 7 years ago
- An annotated Chinese dataset for RE (Relation Extraction) task. ☆14 · Oct 18, 2018 · Updated 7 years ago
- Code for the AAAI 2020 paper "Keyphrase Generation for Scientific Articles using GANs" ☆60 · Dec 8, 2022 · Updated 3 years ago
- Proposed a model architecture which learns to classify duplicate question pairs based on highly contextualized sentence representations. … ☆15 · Dec 8, 2022 · Updated 3 years ago
- Transformer training code for sequential tasks ☆610 · Sep 14, 2021 · Updated 4 years ago
- Code for bidirectional sequence generation (BiSon) for generating from BERT pre-trained models. ☆51 · Mar 17, 2020 · Updated 6 years ago
- ☆13 · Mar 27, 2020 · Updated 6 years ago
- A list of awesome machine question answering datasets ☆15 · Dec 24, 2019 · Updated 6 years ago
- The code for the Subformer, from the EMNLP 2021 Findings paper: "Subformer: Exploring Weight Sharing for Parameter Efficiency in Generati… ☆16 · Sep 1, 2021 · Updated 4 years ago
- Based on https://github.com/fatchord/WaveRNN ☆24 · May 3, 2020 · Updated 6 years ago
- Code for papers "A Surprisingly Robust Trick for Winograd Schema Challenge" and "WikiCREM: A Large Unsupervised Corpus for Coreference Re… ☆71 · Oct 4, 2022 · Updated 3 years ago
- ICASSP 2020 ESPnet-TTS: Merlin baseline system ☆36 · Oct 28, 2019 · Updated 6 years ago
- A tensorflow implementation of the WaveGlow: A Flow-based Generative Network for Speech Synthesis ☆20 · Oct 23, 2019 · Updated 6 years ago
- pytorch implementation for Patient Knowledge Distillation for BERT Model Compression ☆203 · Sep 20, 2019 · Updated 6 years ago
- Learning-Recurrent-Binary-Ternary-Weights ☆13 · Dec 4, 2018 · Updated 7 years ago
- Deep Multi-Speech model ☆11 · Jul 25, 2018 · Updated 7 years ago
- ☆14 · Mar 21, 2020 · Updated 6 years ago
- Longformer: The Long-Document Transformer ☆2,196 · Feb 8, 2023 · Updated 3 years ago
- WIP Tensorflow implementation of https://github.com/mozilla/TTS ☆15 · Apr 11, 2020 · Updated 6 years ago