huggingface / xet-core
xet client tech, used in huggingface_hub
☆372 Updated 2 weeks ago
Alternatives and similar repositories for xet-core
Users who are interested in xet-core compare it to the libraries listed below.
- Rust crates for XetHub ☆75 Updated last year
- This is the documentation repository for SGLang. It is auto-generated from https://github.com/sgl-project/sglang ☆96 Updated this week
- ☆533 Updated 2 months ago
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆202 Updated 3 months ago
- Super-fast Structured Outputs ☆648 Updated last month
- Module, Model, and Tensor Serialization/Deserialization ☆283 Updated 4 months ago
- ☆275 Updated this week
- PyTorch Single Controller ☆939 Updated this week
- High-performance MinHash implementation in Rust with Python bindings for efficient similarity estimation and deduplication of large datas… ☆224 Updated this week
- 👷 Build compute kernels ☆198 Updated 2 weeks ago
- TensorRT-LLM server with Structured Outputs (JSON) built with Rust ☆65 Updated 8 months ago
- Inference server benchmarking tool ☆135 Updated 3 months ago
- Verify Precision of all Kimi K2 API Vendor ☆491 Updated this week
- ☆43 Updated this week
- vLLM adapter for a TGIS-compatible gRPC server. ☆47 Updated this week
- Simple high-throughput inference library ☆155 Updated 7 months ago
- Where GPUs get cooked 👩‍🍳🔥 ☆345 Updated 3 months ago
- PCCL (Prime Collective Communications Library) implements fault tolerant collective communications over IP ☆141 Updated 3 months ago
- Faster structured generation ☆266 Updated 3 weeks ago
- Fast block-level file diffs (e.g. for VM disk images) using CoW filesystem metadata ☆246 Updated 6 months ago
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo) ☆465 Updated 2 weeks ago
- Real-time terminal monitor for InfiniBand networks - htop for high-speed interconnects ☆131 Updated last week
- Unified storage framework for the entire machine learning lifecycle ☆155 Updated last year
- JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs wel… ☆398 Updated 6 months ago
- Transformer GPU VRAM estimator ☆67 Updated last year
- Simple & Scalable Pretraining for Neural Architecture Research ☆306 Updated last month
- This repository has code for fine-tuning LLMs with GRPO specifically for Rust Programming using cargo as feedback ☆114 Updated 10 months ago
- High-performance safetensors model loader ☆90 Updated 3 weeks ago
- IBM development fork of https://github.com/huggingface/text-generation-inference ☆62 Updated 3 months ago
- ☆31 Updated 8 months ago