ialacol / text-inference-batcher
A high-performance batching router that optimises maximum throughput for text inference workloads
☆16 · Sep 6, 2023 · Updated 2 years ago
Alternatives and similar repositories for text-inference-batcher
Users that are interested in text-inference-batcher are comparing it to the libraries listed below
- LLM inference optimization simulator, modeling compute-bound prefill and memory-bound decode phases. ☆13 · Jul 12, 2025 · Updated 7 months ago
- Graph model execution API for Candle ☆17 · Jul 27, 2025 · Updated 6 months ago
- Using modal.com to process FineWeb-edu data ☆20 · Apr 5, 2025 · Updated 10 months ago
- A CUDA runtime environment based on the CUDA Driver API ☆15 · Jul 30, 2025 · Updated 6 months ago
- This library supports evaluating disparities in generated image quality, diversity, and consistency between geographic regions. ☆20 · Jun 3, 2024 · Updated last year
- B-Llama3o a llama3 with Vision Audio and Audio understanding as well as text and Audio and Animation Data output. ☆26 · Jun 3, 2024 · Updated last year
- A curated list of awesome papers about utilizing large language models for ranking. ☆31 · Oct 30, 2024 · Updated last year
- PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation ☆32 · Nov 16, 2024 · Updated last year
- the small distributed language model toolkit; fine-tune state-of-the-art LLMs anywhere, rapidly ☆32 · Oct 19, 2024 · Updated last year
- fine-tuning tutorial ☆17 · Dec 13, 2025 · Updated 2 months ago
- DOS Program Development ☆12 · Nov 9, 2022 · Updated 3 years ago
- Training code for Sparse Autoencoders on Embedding models ☆39 · Feb 27, 2025 · Updated 11 months ago
- utilities for loading and running text embeddings with onnx ☆45 · Aug 16, 2025 · Updated 5 months ago
- A domain-specific language (DSL) based on Triton but providing higher-level abstractions. ☆41 · Feb 4, 2026 · Updated last week
- MEXMA: Token-level objectives improve sentence representations ☆42 · Jan 6, 2025 · Updated last year
- Protocol buffers and other common resources. ☆13 · Jan 20, 2026 · Updated 3 weeks ago
- ☆10 · Jan 9, 2024 · Updated 2 years ago
- A complete (grpc service and lib) Rust inference with multilingual embedding support. This version leverages the power of Rust for both GR… ☆39 · Aug 20, 2024 · Updated last year
- LightGBM for handling label-imbalanced data with focal and weighted loss functions in binary and multiclass classification ☆21 · Jan 29, 2026 · Updated 2 weeks ago
- Redis distributed lock implementation for Python based on Pub/Sub messaging ☆11 · Nov 15, 2025 · Updated 2 months ago
- ☆14 · Dec 12, 2022 · Updated 3 years ago
- ChatGPT CSS style ☆14 · Apr 28, 2024 · Updated last year
- ☆11 · Dec 6, 2023 · Updated 2 years ago
- This project is based on the [LTX-Video](https://github.com/Lightricks/LTX-Video) algorithm of the diffusers and optimized and accelerate… ☆11 · Dec 31, 2024 · Updated last year
- run embeddings in MLX ☆97 · Sep 27, 2024 · Updated last year
- This project auto-instruments containerized workloads in Kubernetes with New Relic agents. ☆12 · Updated this week
- Triton-style kernel toolkit for MLX plus a small upstream incubator: prototype, benchmark, and upstream fusions for Apple Silicon ☆35 · Updated this week
- Instruct.KR 2025 Summer Meetup: Open-source LLMs, all the way to production with vLLM ☆24 · Aug 2, 2025 · Updated 6 months ago
- [ICML 2025] Efficiently Serving Large Multimodal Models Using EPD Disaggregation ☆22 · May 29, 2025 · Updated 8 months ago
- Jupyter Notebooks and other code for 4CE data visualizations. ☆13 · Jan 25, 2023 · Updated 3 years ago
- Push notification service for changes in COVID-19 case counts and new announcements (using data from the Korea Centers for Disease Control and Prevention COVID-19 website) ☆12 · Jan 5, 2023 · Updated 3 years ago
- Slides for the Dev Dive 2022 session "TDD: Type-Driven Development That Raises the Quality of My Code" ☆12 · Nov 20, 2022 · Updated 3 years ago
- Parses PDF drawings using the Nova and Claude 3.7 models on Amazon Bedrock. ☆12 · May 19, 2025 · Updated 8 months ago
- BERT score for text generation ☆12 · Jan 15, 2025 · Updated last year
- 🛰️ Assets for Station ☆13 · Aug 18, 2024 · Updated last year
- The main HSIMP file ☆11 · Jul 28, 2017 · Updated 8 years ago
- ☆11 · Aug 24, 2022 · Updated 3 years ago
- Leveraging passage embeddings for efficient listwise reranking with large language models. ☆50 · Dec 7, 2024 · Updated last year
- The repo of the Doc2SoarGraph framework ☆10 · Sep 17, 2024 · Updated last year