res-eng / incident-writeups
A curated list of well-written publicly available incident writeups
☆13 Updated 6 years ago
Alternatives and similar repositories for incident-writeups
Users who are interested in incident-writeups compare it to the repositories listed below.
- ☆70 Updated 6 years ago
- AWS EBS-EC2 attach utility. UNMAINTAINED, SEE FORK -> ☆29 Updated 2 years ago
- Dogscaler scales up AWS Auto Scaling groups based on the results of a Datadog query. ☆16 Updated last week
- Example code for the blog post on auto-joining a Consul cluster on AWS EC2. ☆63 Updated last month
- This is a collection of readings, talks, and other bits regarding the field of Resilience Engineering ☆225 Updated 7 years ago
- Expose AWS service usage and limits to Prometheus ☆47 Updated last month
- A small helper to generate Honeycomb traces from CI builds ☆229 Updated 4 months ago
- Literature Review for Fault Detection in Distributed Systems ☆60 Updated 8 years ago
- A collection of Twilio SRE's Gameday Templates ☆140 Updated 5 years ago
- A tool for flashing OS images onto stateful servers ☆47 Updated 5 years ago
- Automatically rebalance your Kafka topics, partitions, and replicas across your cluster ☆48 Updated 8 years ago
- A daemon for responding to AWS AutoScaling Lifecycle Hooks ☆147 Updated 3 weeks ago
- A GitHub App that uses kubeval to validate all of that Kubernetes YAML in your repo ☆94 Updated 4 years ago
- Manages attachment of EBS and ENI pairs in AWS EC2 auto scaling groups ☆28 Updated 7 years ago
- Periodically runs a command and exports its return code as a Prometheus metric. ☆118 Updated 2 years ago
- The Consul-Native Service Mesh ☆64 Updated 7 years ago
- A @HashiCorp Terraform provider for managing Google Calendar events. ☆137 Updated 5 years ago
- Kubernetes Resource Explorer ☆135 Updated 7 years ago
- Nginx based Kubernetes ingress controller for AWS ☆58 Updated 2 years ago
- A Chaos Engineering Bootcamp ☆173 Updated 7 years ago
- The Agile Operations methodology ☆145 Updated 2 years ago
- Better Living Through Statistics: Monitoring Doesn't Have To Suck ☆162 Updated last week
- Images and links to references for Kafka Fault Tree Analysis talks by Andrey Falko ☆21 Updated 4 years ago
- Monitoring companion for Nomad periodic jobs and Cron ☆58 Updated 3 years ago
- Terraform InSpec Provisioner Plugin ☆68 Updated 7 years ago
- What I wish I knew before going oncall ☆12 Updated 6 years ago
- Firehose all Nomad job, allocation, node, and evaluation changes to RabbitMQ, Kinesis, or stdout ☆117 Updated 5 years ago
- Simplistic chaos engineering tool for Kubernetes application resilience testing ☆37 Updated 2 years ago
- Repositories of my talks ☆278 Updated 6 years ago
- A Kubernetes operator for managing CloudFormation stacks via a CustomResource ☆100 Updated 3 years ago