res-eng / resilience-for-software
Introduction to resilience engineering concepts for software engineers
☆69 Updated 5 years ago
Alternatives and similar repositories for resilience-for-software:
Users who are interested in resilience-for-software are comparing it to the repositories listed below
- Anonymised database instances as-a-service ☆46 Updated 2 months ago
- A collection of the papers, conference talks, articles, blog posts, interesting Twitter threads, HN/reddit comments on systems engineerin… ☆568 Updated 5 years ago
- Examples of OS / system limits ☆308 Updated 4 years ago
- Systems and failure reading list ☆198 Updated 3 years ago
- Messiness reading list ☆57 Updated 3 years ago
- ☆69 Updated 5 years ago
- This is a collection of readings, talks, and other bits regarding the field of Resilience Engineering ☆227 Updated 6 years ago
- ☆47 Updated 5 years ago
- Prototype implementation of Service-Level Fault Injection Testing in Python. ☆70 Updated 2 years ago
- A list of resources for engineering managers of all levels ☆80 Updated 6 years ago
- A globally-distributed, eventually-consistent, 100% available key-value store ;) ☆127 Updated 2 years ago
- Golang Automatic remediation ☆33 Updated 6 years ago
- Documents and resources for the "Learning from Incidents in Software" slack workspace. ☆39 Updated 4 years ago
- ☆129 Updated last month
- Queueing system simulator ☆52 Updated 8 years ago
- A lightweight metrics explorer for Prometheus, with a focus on on-the-fly analysis. ☆27 Updated 2 years ago
- An access-limiting stateless GitHub API Proxy ☆149 Updated 2 years ago
- Tutorial "Weeks of debugging can save you hours of TLA+". Each git commit introduces a new concept => check the git history! ☆492 Updated 5 months ago
- A collection of Twilio SRE's Gameday Templates ☆140 Updated 4 years ago
- A key/value database inspired by chapter 3 of Designing Data-Intensive Applications by Martin Kleppmann. ☆26 Updated 2 years ago
- A sample of major outages and incidents ☆18 Updated 5 years ago
- A horizontally scalable NGINX caching cluster ☆137 Updated 3 years ago
- Keyless SSH Agent for IAM Entities ☆21 Updated last year
- systems is a set of tools for describing, running and visualizing systems diagrams. ☆360 Updated 2 years ago
- Growing up tech leads: notes on running a skill share for your peers. ☆180 Updated 6 years ago
- Experimental evaluation for the Partisan paper at USENIX ATC 2019. ☆22 Updated 5 years ago
- ☆29 Updated last year
- Slides and resources for talks on partition tolerance ☆33 Updated 6 years ago
- Control Data Store ☆270 Updated last month
- Automate and expose complex infrastructure tasks to teams and services. ☆121 Updated last week