Staging

“Death to staging”. “Staging lies”. “Staging is worse than ‘it works on my machine’”. “WTF!”

I’ve heard those a lot. And, in a way, they are all true. A snowflake staging environment is bad. The reason? Production has unique characteristics in terms of topology, data and a million other details, from the uptime of the machines to pretty much anything else. Put like that, no environment other than production makes sense. I believe this way of seeing things misses a key point.

On staging, the word

The word “staging” triggers bad feelings, and for that reason it should not be used lightly in a discussion when not all participants are on the same page. It triggers all the feelings described in the introduction of this blog post. Its definition, however, is not as bad:

a stage or set of stages or temporary platforms arranged as a support for performers or between different levels of scaffolding

And the definition of “staging area” is even better:

a stopping place or assembly point en route to a destination

Staging, the word, is like “legacy”: supposed to have a relatively positive meaning, but left with a lot of negative ones, mostly because of the scars we all have from operating systems. But it’s really just a stopping place en route to a destination.

A stopping place en route to a destination

Staging is not the final destination. Staging is a stop in the journey to build confidence in the quality of a change. That’s the key here: confidence in a change is built incrementally. We start when we write unit tests: we learn if a few units, in isolation, work on our machine. Then we push those and we learn if those units work in CI. Then we write integration tests, we do manual end-to-end testing and so on. We keep moving changes forward until we reach production. How we reach production is completely up to us: slowly over the course of days or all at once. By testing changes in a different environment first or not. By feature flagging or not. There are a lot of combinations and different needs, but confidence in a change is an incremental process that tends to 100% only when the change is 100% enabled in production and serving traffic. How we combine those strategies depends on the needs of the application and the cost of testing, and is ultimately a matter of tradeoffs that vary case by case.
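To make the “slowly or all at once” choice concrete, here is a minimal sketch of a percentage-based feature flag. It is an illustration under stated assumptions, not a reference to any particular feature flag product: the names rollout_bucket and is_enabled, the flag name and the user IDs are all hypothetical. The idea is that each user maps to a stable bucket, so raising the percentage only ever adds users and never flips someone back.

```python
import hashlib


def rollout_bucket(flag_name: str, user_id: str) -> int:
    """Map a (flag, user) pair to a stable bucket in the range 0..99.

    Hypothetical helper: hashing keeps the bucket identical across
    processes and restarts, unlike a random draw per request.
    """
    key = f"{flag_name}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_name: str, user_id: str, rollout_percent: int) -> bool:
    """Return True when the user falls inside the enabled share of traffic."""
    return rollout_bucket(flag_name, user_id) < rollout_percent


if __name__ == "__main__":
    # Hypothetical flag and users, only to show the exposure growing step by step.
    users = [f"user-{i}" for i in range(1000)]
    for percent in (0, 1, 10, 50, 100):
        exposed = sum(is_enabled("new-checkout", u, percent) for u in users)
        print(f"{percent:>3}% rollout -> {exposed} of {len(users)} users exposed")
```

At 0% nobody sees the change and at 100% everybody does; the steps in between are where confidence grows while the blast radius stays small. A real setup would probably read the percentage from configuration so it can be changed without a deploy.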

Staging is dead…

The “staging environment” as the one alternative to production, in its own snowflake configuration, is almost useless and should be avoided. It will likely give a false sense of confidence in changes, especially if not approached with the right care. As an investment of time and money, it is probably the wrong one: I’d rather invest in getting changes safely to production and testing actively in production with things like feature flags. But it can’t be denied that by using non-snowflake pre-production environments, we gain the possibility to test changes in an integrated environment before hitting production. And that’s good, as long as we understand that testing in staging will not catch all the possible bugs. Note that there are environments so complex that it is close to impossible to build a comprehensive staging environment for them. In those cases, maybe a staging environment is not the right thing. And maybe it’s worth questioning too if we really need all of that complexity.

… long live dynamic pre-production environments.

Stigmatizing words is something that happens pretty frequently in the tech world. When this happens, we adapt our language so that we are not misunderstood and get the best out of the conversations we have. Assuming we are not misunderstood, what we want is clear: to gradually increase our confidence in changes by removing as many unknowns as we can. Testing and conducting experiments in production is key. Using environments that are not production to test those changes is important as well, and those tests are unlikely to backfire if the environments are not snowflakes and if we approach them with the assumption that there is no environment identical to production other than production.