What I Learned Watching Teams Automate Without a Center of Gravity


It has been a little over three years since I last wrote anything here, and the biggest change in that time is not a tool or a platform. It is what I do day to day. I used to build the automation myself. Now I lead the engineers who build it.

I will be honest. I miss the hands-on work. There is a specific satisfaction in writing the playbook, watching it run, and fixing the thing it broke. These days I am more often in the discovery and design phase, in presales conversations helping a client figure out what they actually want before anyone writes a line of code. I lean on a team of engineers who are genuinely better at the development than I would be if I were still doing it full time. That is the right trade. It still stings a little.

What I have gained is a wider view. When you build one team’s automation, you see one team’s problems. When you lead across a dozen engagements, you start to see the same failure mode over and over.

Here it is. Teams automate without a center of gravity for their standards. One group picks up Ansible and writes roles their way. Another group across the org picks up Ansible and writes roles a completely different way. A third bolts on Terraform with its own conventions. Six months later there are fifteen different ways to do the same thing, no shared modules, no agreed naming, and nobody who owns the platform. The automation that was supposed to reduce toil has become its own source of toil. The thing meant to fix the entropy is now generating it.

This is the part I keep coming back to with clients, and it is why I push the Automation Center of Excellence harder than anything else I recommend. I know “Center of Excellence” sounds like a slide somebody made to justify a reorg. That is not what I mean. A CoE is the answer to a real problem I have watched play out repeatedly, across large-scale SRE engagements and enterprise client work. It is the place where the standards live. Someone owns the shared modules. Someone decides how things get named. Someone says no when a team wants to reinvent a wheel that already rolls fine. That is it. It is not bureaucracy. It is the difference between automation as an asset and automation as fifteen competing dialects nobody can maintain.

Most of the work itself is hybrid cloud. On-prem and cloud together, Ansible and Terraform together, and a client who needs the same outcome on both sides without two separate teams maintaining two separate truths. That is where the standards question gets sharp, because a hybrid environment punishes inconsistency faster than a single platform ever will.

If you are automating right now and you cannot point to who owns your standards, that is the gap to close first. Not the next tool. Not the next platform. The center of gravity. Everything else falls apart without it, and I have watched it fall apart enough times to stop treating it as optional.

It is good to be writing again.
