Resources for Large Language Model Inference
☆17Dec 29, 2023Updated 2 years ago
Alternatives and similar repositories for llm-inference
Users who are interested in llm-inference are comparing it to the libraries listed below.
- AI model designed to test the effectiveness in handling external ethical attacks.☆11Feb 9, 2026Updated 4 months ago
- ☆19Oct 18, 2025Updated 8 months ago
- lightsocks client implemented in Go☆13Sep 11, 2015Updated 10 years ago
- ☆23Dec 18, 2023Updated 2 years ago
- MARNNs Can Learn Generalized Dyck Languages☆12Nov 11, 2019Updated 6 years ago
- CJOBS☆16Dec 22, 2018Updated 7 years ago
- TUI client for SVT's text-TV (teletext), written in Go.☆23May 28, 2021Updated 5 years ago
- ☆17Jan 1, 2024Updated 2 years ago
- Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios.☆16Aug 31, 2023Updated 2 years ago
- ☆13Apr 14, 2026Updated 2 months ago
- Official Implementation of Knowledge Flow Prompting☆35Oct 20, 2025Updated 7 months ago
- NeurIPS 2026 paper: The Geometry of Consolidation — follow-up to HIDE and No-Escape.☆110May 5, 2026Updated last month
- Bayesian AB Tests Examples☆22Apr 25, 2022Updated 4 years ago
- ☆13Jun 5, 2024Updated 2 years ago
- Demonstrates the impressive effect of vllm on Chinese large language models☆31Nov 4, 2023Updated 2 years ago
- A list of compatible datasets, noting other major repositories containing popular real-world datasets, along with sample code for a range…☆12Oct 13, 2019Updated 6 years ago
- A Qwen 0.5B reasoning model trained on OpenR1-Math-220k☆14Oct 11, 2025Updated 8 months ago
- Statically linked arm64/aarch64 binaries for everyday needs☆31Oct 22, 2025Updated 7 months ago
- VenomPred 2.0 API☆11Feb 4, 2026Updated 4 months ago
- Nexus is a Rust-based CLI tool for analyzing and visualizing code dependencies in source files.☆18Nov 9, 2025Updated 7 months ago
- Build llama inference computation from scratch, using only torch/numpy base ops☆16May 5, 2026Updated last month
- Official implementation: Large Language Models are Interpretable Learners - Google☆13Jun 29, 2024Updated last year
- Python package for P2 (Path Planning), a masked diffusion model sampling method for sequence generation (protein, text, etc.).☆23Aug 19, 2025Updated 9 months ago
- Manage PCI Devices and PCI Device Claims for PCI Passthrough in Harvester☆20Jun 5, 2026Updated last week
- This app creates or reads Parquet datasets☆32Apr 24, 2025Updated last year
- PyTorch for RISC-V Architecture on OpenEuler 24.03☆13Jun 27, 2024Updated last year
- The frontend app of Mailcow's CowUI web interface☆12Apr 29, 2024Updated 2 years ago
- ☆73Updated this week
- The Ollama Toolkit is a collection of powerful tools designed to enhance your experience with the Ollama project, an open-source framewor…☆33Oct 29, 2024Updated last year
- XLA integration of Open Neural Network Exchange (ONNX)☆19Aug 17, 2018Updated 7 years ago
- ☆14Jan 8, 2025Updated last year
- ☆10Dec 4, 2019Updated 6 years ago
- A comprehensive time-series dataset survey☆20Aug 1, 2022Updated 3 years ago
- Implementation of Accurate Multivariate Stock Movement Prediction via Data-Axis Transformer with Multi-Level Contexts☆33May 1, 2022Updated 4 years ago
- ☆11Mar 16, 2026Updated 3 months ago
- User Simulation for Conversational Recommendation☆19Jan 30, 2026Updated 4 months ago
- Simple example of autonomous research run in parallel from my Aetherius Ai Assistant project. Uses OpenAI's GPT-3.5, GPT-4, and Microsof…☆15May 11, 2023Updated 3 years ago
- A tarot card service built with Next.js☆16Dec 25, 2024Updated last year
- Official code release for Ambient Protein Diffusion☆35Aug 30, 2025Updated 9 months ago