huggingface / xet-core
xet client tech, used in huggingface_hub
☆80 Updated this week
Alternatives and similar repositories for xet-core:
Users who are interested in xet-core are comparing it to the libraries listed below.
- Model Context Protocol Server for Apache OpenDAL™ ☆27 Updated last week
- vLLM adapter for a TGIS-compatible gRPC server. ☆26 Updated this week
- Super-fast Structured Outputs ☆191 Updated this week
- This repository has code for fine-tuning LLMs with GRPO specifically for Rust Programming using cargo as feedback ☆79 Updated last month
- Rust implementation of Surya ☆57 Updated last month
- This repository contains statistics about the AI Infrastructure products. ☆18 Updated last month
- 👷 Build compute kernels ☆32 Updated this week
- ☆191 Updated 2 weeks ago
- TensorRT-LLM server with Structured Outputs (JSON) built with Rust ☆49 Updated last week
- Rust crates for XetHub ☆41 Updated 6 months ago
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models. ☆68 Updated this week
- TRITONCACHE implementation of a Redis cache ☆13 Updated this week
- A single-binary, GPU-accelerated LLM server (HTTP and WebSocket API) written in Rust ☆79 Updated last year
- A benchmarking tool for comparing different LLM API providers' DeepSeek model deployments. ☆29 Updated 3 weeks ago
- parallel fetch ☆123 Updated this week
- clustering algorithm implementation ☆13 Updated 2 weeks ago
- ANE accelerated embedding models! ☆16 Updated 4 months ago
- Benchmark suite for LLMs from Fireworks.ai ☆70 Updated 2 months ago
- ☆11 Updated 2 months ago
- XTR/WARP is an extremely fast and accurate retrieval engine based on Stanford's ColBERTv2/PLAID and Google DeepMind's XTR. ☆123 Updated 5 months ago
- LLM-as-SERP ☆63 Updated last month
- Extract core logic from qdrant and make it available as a library. ☆57 Updated last year
- ☆39 Updated 2 years ago
- Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper) ☆28 Updated last year
- python bindings for symphonia/opus - read various audio formats from python and write opus files ☆56 Updated last week
- Vector Database with support for late interaction and token level embeddings. ☆54 Updated 6 months ago
- ☆126 Updated 11 months ago
- Try out HallOumi, a state-of-the-art claim verification model in a simple UI! ☆30 Updated 2 weeks ago
- A minimalistic C++ Jinja templating engine for LLM chat templates ☆131 Updated last week
- The DPAB-α Benchmark ☆19 Updated 3 months ago