lhoestq / hfjobs
Hugging Face Jobs
☆19Updated last month
Alternatives and similar repositories for hfjobs
Users that are interested in hfjobs are comparing it to the libraries listed below
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o…☆143Updated last month
- ☆124Updated 9 months ago
- QLoRA with Enhanced Multi GPU Support☆37Updated 2 years ago
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models.☆83Updated this week
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/…☆27Updated last year
- Let's build better datasets, together!☆261Updated 8 months ago
- An introduction to LLM Sampling☆79Updated 8 months ago
- Load compute kernels from the Hub☆244Updated this week
- 🤝 Trade any tensors over the network☆30Updated last year
- ☆134Updated this week
- A toolkit implementing advanced methods to transfer models and model knowledge across tokenizers.☆43Updated last month
- ☆68Updated last month
- Google TPU optimizations for transformers models☆118Updated 7 months ago
- A fast implementation of T5/UL2 in PyTorch using Flash Attention☆107Updated 5 months ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.☆33Updated 3 months ago
- ☆134Updated last year
- ☆49Updated 6 months ago
- ☆19Updated 2 years ago
- Truly flash implementation of DeBERTa disentangled attention mechanism.☆63Updated 2 weeks ago
- Fast, Modern, and Low Precision PyTorch Optimizers☆108Updated 3 weeks ago
- ☆49Updated last year
- Manage scalable open LLM inference endpoints in Slurm clusters☆270Updated last year
- minimal pytorch implementation of bm25 (with sparse tensors)☆104Updated last year
- ☆15Updated last year
- ☆51Updated 6 months ago
- Crispy reranking models by Mixedbread☆34Updated last month
- Collection of autoregressive model implementation☆86Updated 4 months ago
- 👷 Build compute kernels☆106Updated last week
- Code for Zero-Shot Tokenizer Transfer☆135Updated 7 months ago
- experiments with inference on llama☆104Updated last year