6 articles tagged #sre.
Practices for configuring effective alerts that flag real problems without causing fatigue from excessive notifications.
A culture and set of practices that unify development (Dev) and operations (Ops) to deliver software with greater speed, quality, and reliability. It's not a role; it's a way of working.
The set of technical and cultural practices that implement DevOps principles, from Infrastructure as Code to blameless post-mortems. The "how" behind the philosophy.
Processes and practices for detecting, responding to, resolving, and learning from production incidents in a structured and effective way.
A discipline that applies software engineering principles to infrastructure operations, with a focus on building scalable and highly reliable systems.
Framework for defining, measuring, and communicating service reliability through service level objectives (SLOs), indicators (SLIs), and agreements (SLAs).
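As a rough illustration of how an SLI relates to an SLO, the sketch below computes an availability SLI from event counts and the share of the error budget still unspent. The function names, the request-count model, and the choice to treat zero traffic as fully compliant are assumptions made for this example, not definitions taken from the articles listed here.

```python
def availability_sli(good_events: int, total_events: int) -> float:
    """Fraction of events that met the success criterion."""
    if total_events == 0:
        # Assumption: a window with no traffic counts as fully compliant.
        return 1.0
    return good_events / total_events


def error_budget_remaining(sli: float, slo_target: float) -> float:
    """Share of the error budget still unspent.

    1.0 means untouched, 0.0 means exhausted, negative means overspent.
    """
    allowed_failure = 1.0 - slo_target
    if allowed_failure <= 0:
        raise ValueError("slo_target must be below 1.0")
    actual_failure = 1.0 - sli
    return 1.0 - actual_failure / allowed_failure


if __name__ == "__main__":
    # Hypothetical window: 10,000 requests, 5 of them failed, 99.9% SLO.
    sli = availability_sli(good_events=9_995, total_events=10_000)
    print(f"SLI: {sli:.4%}")
    print(f"Budget remaining: {error_budget_remaining(sli, 0.999):.1%}")
```

With a 99.9% target, 10,000 requests allow about 10 failures, so 5 failures leave roughly half the budget. An SLA would typically sit on top of numbers like these as a contractual commitment, which this sketch does not model.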