Reduce the size of pretrained Hugging Face models via vocabulary trimming.
☆49Dec 28, 2022Updated 3 years ago
Alternatives and similar repositories for hf-trim
Users who are interested in hf-trim are comparing it to the libraries listed below.
- Pipeline for easy fine-tuning of BERT architecture for sequence classification☆23Jul 21, 2023Updated 2 years ago
- Chinese generation tasks with the multilingual denoising pretrained model MBart☆11May 27, 2021Updated 4 years ago
- REST API for sentence tokenization and embedding using Multilingual Universal Sentence Encoder.☆51Sep 5, 2021Updated 4 years ago
- Unofficial implementation of QaNER: Prompting Question Answering Models for Few-shot Named Entity Recognition.☆64Oct 15, 2022Updated 3 years ago
- TVRecap: A Dataset for Generating Stories with Character Descriptions☆21Jun 5, 2023Updated 2 years ago
- Pretraining scripts for BART transformer model☆12May 15, 2023Updated 3 years ago
- Winning Solution for the M5 Competition for Uncertainty Forecasting☆10May 25, 2023Updated 2 years ago
- Pipeline for training Stanford Seq2Seq Neural Machine Translation using PyTorch.☆12Jan 17, 2021Updated 5 years ago
- Unofficial implementation of paper "InstructionNER: A Multi-Task Instruction-Based Generative Framework for Few-shot NER" (https://arxiv.…☆38Feb 14, 2024Updated 2 years ago
- ☆11Aug 2, 2022Updated 3 years ago
- Source code for the GPT-2 story generation models in the EMNLP 2020 paper "STORIUM: A Dataset and Evaluation Platform for Human-in-the-Lo…☆41Jan 17, 2024Updated 2 years ago
- Generating artificial disfluencies from fluent text easily and promptly☆15Sep 28, 2022Updated 3 years ago
- Code for extracting parallel corpora from pmindia☆17Jan 28, 2020Updated 6 years ago
- A chess engine designed to fit into 4kb☆12Updated this week
- ☆15Apr 12, 2021Updated 5 years ago
- Self-Supervised Document-to-Document Similarity Ranking via Contextualized Language Models and Hierarchical Inference☆44Nov 28, 2022Updated 3 years ago
- PyTorch reimplementation of the paper "SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization"☆16Oct 17, 2021Updated 4 years ago
- Bi-encoder entity linking architecture☆52Sep 10, 2024Updated last year
- This repo contains the code and instructions for our paper : Evaluating the Impact of Knowledge Graph Context on Entity Disambiguation Mo…☆26Oct 24, 2020Updated 5 years ago
- UzTransliterator | State-of-the-art machine transliteration tool for Uzbek language☆13Jan 6, 2026Updated 4 months ago
- ☆50Sep 6, 2025Updated 8 months ago
- Finetune Malaysian LLM for Malaysian context embedding task.☆23Apr 27, 2024Updated 2 years ago
- An opinionated NLP research template☆10Aug 29, 2024Updated last year
- Documentation and background of sign language processing☆147May 10, 2026Updated last week
- LLM checkpointing for DeepSpeed/Megatron☆25Nov 30, 2025Updated 5 months ago
- Lyrics and Vocal Melody Generation conditioned on Accompaniment☆28Aug 27, 2022Updated 3 years ago
- ☆13Jan 17, 2024Updated 2 years ago
- Python source code for EMNLP 2021 Findings paper: "Subword Mapping and Anchoring Across Languages".☆13Sep 17, 2021Updated 4 years ago
- ML Reproducibility Challenge 2020: Electra reimplementation using PyTorch and Transformers☆12Apr 16, 2021Updated 5 years ago
- This is an implementation of the audio source separation model as well as the evaluation metrics proposed in the paper "Weakly Informed A…☆12Nov 26, 2019Updated 6 years ago
- A collection of datasets for language model pretraining, including scripts for downloading, preprocessing, and sampling.☆64Jul 29, 2024Updated last year
- A Smalltalk Web Browser for Squeak/Smalltalk☆18Apr 18, 2022Updated 4 years ago
- ☆13Jul 10, 2020Updated 5 years ago
- Gradient accumulation on tf.estimator☆12Dec 15, 2020Updated 5 years ago
- Universal Semantic Annotator (LREC 2022)☆18Jan 29, 2025Updated last year
- ⚡ boost inference speed of T5 models by 5x & reduce the model size by 3x.☆588Apr 24, 2023Updated 3 years ago
- Python for Data Analysis course for beginners☆26Feb 25, 2022Updated 4 years ago
- Using data from IBM Watson, descriptive and predictive analytics using Python and tableau☆12Dec 23, 2017Updated 8 years ago
- 🚀🤗 A collection of templates for Hugging Face Spaces☆34Oct 9, 2023Updated 2 years ago