🤗 A collection of templates for Hugging Face Spaces
☆35 · Oct 9, 2023 · Updated 2 years ago
Alternatives and similar repositories for spaces-docker-templates
Users who are interested in spaces-docker-templates are comparing it to the libraries listed below
- Scripts to convert datasets from various sources to Hugging Face Datasets. ☆57 · Oct 26, 2022 · Updated 3 years ago
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/… ☆28 · Apr 17, 2024 · Updated last year
- Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models" ☆28 · Oct 3, 2021 · Updated 4 years ago
- GitHub action that'll sync files from a GitHub Repo with the Hugging Face Hub 🤗 ☆79 · Oct 30, 2024 · Updated last year
- User-friendly viewer for Parquet files ☆10 · Jan 10, 2026 · Updated last month
- High-performance, asynchronous Python HTTP client library designed for faster file transfers using concurrency, semaphores, and fault-tol… ☆59 · May 12, 2025 · Updated 9 months ago
- My NER Experiments with ModernBERT and Ettin ☆26 · Jul 17, 2025 · Updated 7 months ago
- A fun (yet toxic) Twitter bot that uses GPT-3 to either roast or toast a tweet if you mention it in the replies ☆31 · Jan 14, 2023 · Updated 3 years ago
- Code for SaGe subword tokenizer (EACL 2023) ☆27 · Nov 30, 2024 · Updated last year
- HyPe: Better Pre-trained Language Model Fine-tuning with Hidden Representation Perturbation [ACL 2023] ☆14 · Jul 11, 2023 · Updated 2 years ago
- Python Module implementing SRP ☆12 · Jul 29, 2022 · Updated 3 years ago
- Execute arbitrary SQL queries on 🤗 Datasets ☆32 · Jan 24, 2024 · Updated 2 years ago
- Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment ☆11 · Apr 6, 2025 · Updated 11 months ago
- T-Projection is a method to perform high-quality Annotation Projection of Sequence Labeling datasets. ☆13 · Nov 21, 2023 · Updated 2 years ago
- ☆12 · Dec 6, 2024 · Updated last year
- ANE accelerated embedding models! ☆20 · Dec 11, 2024 · Updated last year
- ☆16 · Aug 10, 2022 · Updated 3 years ago
- Official implementation of "Data Mixture Inference: What do BPE tokenizers reveal about their training data?" ☆18 · May 15, 2025 · Updated 9 months ago
- ☆16 · Dec 14, 2022 · Updated 3 years ago
- Resources related to EMNLP 2021 paper "FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations" ☆13 · Dec 14, 2021 · Updated 4 years ago
- Command Line Interface for running 🤗 Transformers Image Classification locally ☆19 · May 8, 2025 · Updated 9 months ago
- The training codes of Jasper-Token-Compression-600M ☆19 · Nov 19, 2025 · Updated 3 months ago
- KIND: an Italian Multi-Domain Dataset for Named Entity Recognition ☆15 · Jun 28, 2023 · Updated 2 years ago
- The official repository for 'CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos', SIGI… ☆16 · May 4, 2022 · Updated 3 years ago
- Drop in replacement for OpenAI, but with Open models. ☆157 · May 11, 2023 · Updated 2 years ago
- DImensionality REduction in JAX ☆25 · Nov 21, 2025 · Updated 3 months ago
- LV-BERT: Exploiting Layer Variety for BERT (Findings of ACL 2021) ☆18 · May 10, 2023 · Updated 2 years ago
- Hugging Face and Pyserini interoperability ☆19 · May 18, 2023 · Updated 2 years ago
- Use Actions to acquire those precious lambda GPUs ☆19 · Sep 7, 2023 · Updated 2 years ago
- A framework that aims to wisely initialize unseen subword embeddings in PLMs for efficient large-scale continued pretraining ☆18 · Nov 26, 2023 · Updated 2 years ago
- Official code for the paper: "Metadata Archaeology" ☆19 · May 10, 2023 · Updated 2 years ago
- Discord TL;DR, a bot that allows you to summarize conversations on any Discord channel in several languages. ☆14 · Jan 17, 2023 · Updated 3 years ago
- Learning to Model Editing Processes ☆26 · Aug 3, 2025 · Updated 7 months ago
- Code for ACL 2023 Paper: ACLM: A Selective-Denoising based Generative Data Augmentation Approach for Low-Resource Complex NER ☆21 · Jul 19, 2023 · Updated 2 years ago
- Code for AAAI 2023 Paper: "Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models" ☆18 · Dec 6, 2022 · Updated 3 years ago
- Temporarily remove unused tokens during training to save RAM and improve speed. ☆23 · Jun 15, 2025 · Updated 8 months ago
- Overview of corpora/datasets for Germanic low-resource languages and dialects. Accompanies "A Survey of Corpora for Germanic Low-Resource… ☆26 · Feb 16, 2026 · Updated 2 weeks ago
- BPE modification that implements removing of the intermediate tokens during tokenizer training. ☆26 · Nov 25, 2024 · Updated last year
- Serving Platform for Spatial AI and Robotics. ☆23 · Jun 19, 2025 · Updated 8 months ago