
Superficial Blamelessness
In 2012, John Allspaw (then CTO of Etsy) wrote a seminal blog post on the need for what he called Blameless Postmortems. Building on the notion of a “Just Culture” from the research of Sidney Dekker, he summarized Etsy’s approach to balancing accountability in post-incident reviews with the need to avoid blaming the individuals involved in the incident:
“Having a "Just Culture" means that you’re making [an] effort to balance safety and accountability. It means that by investigating mistakes in a way that focuses on the situational aspects of a failure’s mechanism and the decision-making process of individuals proximate to the failure, an organization can come out safer than it would normally be if it had simply punished the actors involved as a remediation.”
He goes on to explain how engineers involved in incidents can be encouraged to give detailed accounts of what happened without fear of punishment or retribution. The word blamelessness took on a life of its own and became slightly disconnected from Dekker’s research and John’s blog post. Further refinements such as blame-awareness came in to add some nuance to the ongoing discourse as well.
For many organizations, some form of blamelessness has become standard practice, and blame-awareness has been gaining in popularity. However, I have also noticed an anti-pattern that I like to call superficial (or shallow) blamelessness, and I think it is important for people to be on the lookout for it.
Why Try to Be Blameless?
Resilience Engineering, and many of the connected disciplines it borrows from, aims to leverage incidents and the energy put behind them to better understand how a system works. The idea goes, the better you understand the dynamics in play, the less likely you are to make misguided suggestions, to operate on false assumptions, and to create accidental harm (among many other things).
Disciplines that look at complex systems prefer to focus on interactions between components rather than specific parts themselves (what many consider to be the “root cause”): individual units can be working fine—reliably and in accordance with their goals—but still end up behaving surprisingly as an ensemble.
Looking at the parts in isolation may work well enough when your systems are simple: find what broke, replace or fix or tweak it, and move on. But as they scale up and become more connected to other systems, they also necessarily become more complex. As a result, the way we construct explanations about those systems has to change as well. Looking at so-called “faulty” individual components or people that need to be corrected provides diminishing returns and loses its effectiveness.
Ultimately, this is why blame hurts. Blame fundamentally implies that someone or something misbehaved and is responsible for causing the outcomes. Blame, as a concept, invites retribution. And retribution in turn drives people involved in adverse events to protect themselves, which means it becomes harder to learn from what happened.
So not only do we get a focus that is centred on individual units, we make it harder to ever get a better understanding on top of it. This is where concepts such as blamelessness (or blame-awareness) come from: we realize that blame is an expected reaction to surprising negative events, but also know that post-incident responses that lean into these feelings do not tend to be effective in the long run.
But being blameless isn’t purely a matter of justice, of ensuring we don’t punish people for being put in a tough situation by a system that doesn’t even realize it (although it certainly can help). It is also a concept that invites a systems-oriented stance.
You don’t just avoid blame. Feeling the urge to blame is natural, but you have to go past it and shift your perspective towards the system “owning” the situation.
What Makes Blamelessness Superficial
Superficial blamelessness will be easiest to spot when retribution is still the norm, whether officially or only informally. People will avoid naming employees or even teams, and internal reports will look as non-specific as if they were intended for an external audience. But even if you take away the “name” bit from the cycle of name → blame → shame, the cycle lives on.
Organizations where blame is less “ambient” or better harnessed will instead consider it safe to name names, and will invite a conversation with the people involved to get their perspective—one they do not have as a bystander—to get a richer view of how the system works. But even in these places, where it might be safe to give your account of what happened, there’s not yet any guarantee that you’re getting what you should out of it all. Blame is only one of many blockers on the way to a better systems perspective.
Be wary of successfully avoiding retribution, yet finding your post-incident process still biased towards an individualistic stance instead of a systemic one. Consider, for example, whether suggestions of what to improve after adverse events mostly focus on what specific people involved need to do better, even without punishment. Common ones are:
- Provide more training
- Adjust a worker’s behavior
- Write a specific patch or test
- Be reminded to “pay more attention”
- Add more reviewers or supervision to catch mistakes
- Tweak the process or procedures directly related to the incident
The locus of intervention, in these cases, is oriented towards specific people or components. Some of these might be useful and helpful, but if they’re most of what you get, you can suspect a systemic point of view isn’t strongly developed yet.
More systemic interventions are those that would change the conditions that lead to challenging situations. You could, for example, change or clarify broad pressures and goal conflicts. Likewise, the behavior you didn’t like in the context leading up to an incident might be valued in “normal” circumstances. Software development, after all, can have contributing factors that come from other departments too.
You might also find items such as tweaking processes or procedures that go beyond the circumstances of the incident. Maybe someone found out that relevant and useful information came up in informal discussion groups: those could perhaps be encouraged more, or the recipe expanded to other areas important to the organization, because they can help foster a culture that can deal with surprises better. If you find other teams have solutions that exist already, it’s worth asking if they could be borrowed and propagated, but also what could make it easier to figure out if there are ideas worth borrowing. And why not go further and find out under what circumstances that other team figured it out to see if there’s something to learn?
Control-centric vs. Empowerment-centric Approaches
The distinction between systemic and individualistic is usually a good marker, mostly observable once you are looking at a completed review process and its list of actions. It might not be visible yet, because the process isn’t over or is continuous, and it might also not be simple: people do what they can given the circumstances.
Another way to frame this comparison between individualistic and systemic elements is as an opposition between control and empowerment. If you approach the problem from a stance where you need to restrict and control what people do, keeping it within specific parameters because they are not operating as you expect them to, then you are taking a control-centric approach. Ask yourself whether this isn’t a subtle way of blaming people for the problems they’ve been asked to solve.
An empowerment-centric approach would instead ask how you can make the work simpler, safer, easier, clearer, or more manageable. Involve the people impacted in the process, not because they need to do better, but because the system needs to make sure they’re operating in better conditions.

Fred Hebert
Staff SRE
