huggingface / doc-builder
The package used to build the documentation of our Hugging Face repos
☆110 Updated last week
Alternatives and similar repositories for doc-builder:
Users who are interested in doc-builder are comparing it to the libraries listed below:
- ☆122 Updated 5 months ago
- Dataset collection and preprocessing framework for NLP extreme multitask learning ☆178 Updated 3 months ago
- Pipeline for pulling and processing online language model pretraining data from the web ☆177 Updated last year
- [WIP] A 🔥 interface for running code in the cloud ☆85 Updated 2 years ago
- Google TPU optimizations for transformers models ☆107 Updated 2 months ago
- **ARCHIVED** Filesystem interface to 🤗 Hub ☆58 Updated 2 years ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆254 Updated 9 months ago
- ☆67 Updated 2 years ago
- ☆169 Updated 2 months ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data; it should also work with any Hugging Face text dataset. ☆93 Updated 2 years ago
- 🤗 Disaggregators: Curated data labelers for in-depth analysis. ☆65 Updated 2 years ago
- Some common Hugging Face transformers in maximal update parametrization (µP) ☆80 Updated 3 years ago
- Let's build better datasets, together! ☆257 Updated 3 months ago
- Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ☆81 Updated last year
- ☆199 Updated last year
- Exploring finetuning public checkpoints on filtered 8K sequences from the Pile ☆115 Updated 2 years ago
- Comprehensive analysis of the differences in performance between QLoRA, LoRA, and full finetunes. ☆82 Updated last year
- ☆19 Updated 2 years ago
- experiments with inference on llama ☆104 Updated 10 months ago
- A library for squeakily cleaning and filtering language datasets. ☆46 Updated last year
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o… ☆128 Updated 3 months ago
- Command Line Interface for Hugging Face Inference Endpoints ☆66 Updated last year
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆59 Updated last year
- minimal pytorch implementation of bm25 (with sparse tensors) ☆100 Updated last year
- Minimal sharded dataset loaders, decoders, and utils for multi-modal document, image, and text datasets. ☆157 Updated last year
- Code for Zero-Shot Tokenizer Transfer ☆127 Updated 3 months ago
- Experiments for efforts to train a new and improved t5 ☆77 Updated 11 months ago
- Supercharge huggingface transformers with model parallelism. ☆76 Updated 6 months ago
- Scripts to convert datasets from various sources to Hugging Face Datasets. ☆57 Updated 2 years ago
- Experiments with generating open-source language model assistants ☆97 Updated last year