huggingface / xet-core
xet client tech, used in huggingface_hub
☆403 · Updated this week
Alternatives and similar repositories for xet-core
Users interested in xet-core are comparing it to the libraries listed below.
- Rust crates for XetHub ☆78 · Updated last year
- Module, Model, and Tensor Serialization/Deserialization ☆287 · Updated this week
- This is the documentation repository for SGLang. It is auto-generated from https://github.com/sgl-project/sglang ☆100 · Updated this week
- ☆541 · Updated 4 months ago
- Super-fast Structured Outputs ☆679 · Updated last week
- ☆280 · Updated last week
- PyTorch Single Controller ☆967 · Updated this week
- TensorRT-LLM server with Structured Outputs (JSON) built with Rust ☆66 · Updated 9 months ago
- ☆44 · Updated this week
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆203 · Updated 4 months ago
- High-performance MinHash implementation in Rust with Python bindings for efficient similarity estimation and deduplication of large datasets ☆231 · Updated this week
- Real-time terminal monitor for InfiniBand networks - htop for high-speed interconnects ☆137 · Updated last month
- Where GPUs get cooked 👩‍🍳🔥 ☆363 · Updated 3 weeks ago
- Official Python API client library for turbopuffer ☆103 · Updated this week
- Faster structured generation ☆275 · Updated 2 weeks ago
- Inference server benchmarking tool ☆142 · Updated 4 months ago
- 👷 Build compute kernels ☆215 · Updated 2 weeks ago
- Fault tolerance for PyTorch (HSDP, LocalSGD, DiLoCo, Streaming DiLoCo) ☆475 · Updated last week
- Self-hosted huggingface mirror service ☆212 · Updated 6 months ago
- Transformer GPU VRAM estimator ☆68 · Updated last year
- Utils for Unsloth https://github.com/unslothai/unsloth ☆191 · Updated this week
- vLLM adapter for a TGIS-compatible gRPC server ☆51 · Updated this week
- Self-host LLMs with vLLM and BentoML ☆168 · Updated 3 weeks ago
- ☆140 · Updated last year
- JetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome) ☆404 · Updated last month
- Code for fine-tuning LLMs with GRPO for Rust programming, using cargo as feedback ☆114 · Updated 11 months ago
- Verify the precision of all Kimi K2 API vendors ☆507 · Updated 2 weeks ago
- ClearML Fractional GPU: run multiple containers on the same GPU with driver-level memory limitation ✨ and compute time-slicing ☆88 · Updated 2 months ago
- Embeddable library or single binary for indexing and searching 1B vectors ☆366 · Updated last month
- ☆135 · Updated last year