Jason Barto discusses fault isolation boundaries and ways to take advantage of fault isolation in AWS, demonstrating initial tests used to ensure a system has successfully isolated faults.
No system is immune to failure, so the real question is how it will fail.
The talk covers failure domains, fault isolation, and how to test that isolation using chaos engineering.