Trainy-ai / konduktor
cluster/scheduler health monitoring for GPU jobs on k8s
☆41 · Updated this week
Related projects:
- Fine-tuning and serving LLMs on any cloud ☆85 · Updated 9 months ago
- Profiling tools for distributed training ☆37 · Updated 10 months ago
- WebAssembly dev environment for Envoy Proxy. Iterate on your HTTP/TCP middleware in seconds! ☆54 · Updated last year
- Cedana: Access and run on compute anywhere in the world, on any provider. Migrate seamlessly between providers, arbitraging price/perform… ☆53 · Updated 5 months ago
- A simple DAG for executing LLM calls and using tools. ☆37 · Updated last year
- Orchestrated process and container checkpointing ☆61 · Updated this week
- Action library for AI Agent ☆187 · Updated last week
- ⚡️ A fast and flexible PyTorch inference server that runs locally, on any cloud or AI HW. ☆125 · Updated 3 months ago
- Run GPU Workloads Across Multiple Clouds ☆385 · Updated this week
- Finetune LLMs on K8s by using Runbooks ☆168 · Updated 3 weeks ago
- Felafax is building AI infra for non-NVIDIA GPUs ☆302 · Updated this week
- A simple Pure Python/PyTorch performance daemon for training workloads ☆14 · Updated last year
- Self-hardening firewall for large language models ☆254 · Updated 6 months ago
- This is a landscape of the infrastructure that powers the generative AI ecosystem ☆123 · Updated 3 weeks ago
- visualize your gpu usage ☆17 · Updated last year
- Synthetic Data for LLM Fine-Tuning ☆78 · Updated 9 months ago
- A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving. ☆45 · Updated 5 months ago
- LLM fine-tuning and eval ☆340 · Updated 6 months ago
- AutoEvals is a tool for quickly and easily evaluating AI model outputs using best practices. ☆150 · Updated this week
- The only Vector tooling you'll need. Star the repo and look out for an email to try out a brand new Vector Data Exploration demo! Use the… ☆195 · Updated this week
- Run GGML models with Kubernetes. ☆172 · Updated 9 months ago
- Sister project to OpenLLMetry, but in Typescript. Open-source observability for your LLM application, based on OpenTelemetry ☆259 · Updated this week
- Agent accuracy measurements for LLMs ☆201 · Updated 3 months ago
- Data-Driven Evaluation for LLM-Powered Applications ☆432 · Updated 2 weeks ago
- Text analytics for LLM apps. Cluster messages to detect use cases, outliers, power users. Detect intents and run evals with LLM (OpenAI, … ☆352 · Updated this week
- Prompt engineering, automated. ☆201 · Updated this week
- Runner in charge of collecting metrics from LLM inference endpoints for the Unify Hub ☆16 · Updated 7 months ago
- 🐦 An open, blazing-fast, simple model gateway for rapid development of production GenAI apps ☆123 · Updated last month
- Private Open AI on Kubernetes ☆298 · Updated this week
- Open source AI on-call developer 🧙‍♂️ Get relevant context & root cause analysis in seconds about production incidents and make on-call … ☆241 · Updated this week