Cloud Costs Love-Hate Relationship Advice Using F1 Tech and Bulk Sugar
By Brent E, VP Technology Optimization, Ticketmaster
Interesting Cloud Cost article by CIO
I have some advice that has been socialized with hundreds of folks and 100+ enterprises over the years. More than 8 years and millions of unique workloads measured and optimized later, I have found some analogies and tips that folks can start using to avoid cloud and colo cost confusion. This tech-agnostic approach flushes out the facts and proves that you just need to know how to pick and drive the most viable deployment stack for each workload. We have never had so many good, viable options to run our workloads on.
Everyone drives a car, and most of us drive well. However, if you have seen the steering wheel of a modern F1 car, which is augmented by 200 sensors, 1,000 channels of data, and 30 to 40 people constantly reading that data over the race weekend to improve performance and assist key functions, you quickly realize that you don’t know how to effectively drive the most advanced vehicle of today. Without advanced or expert level skills you are not competitive, and you end up near last with a ton of waste, bad habits, and lots to learn. This is one of the reasons why TMtech released our AWS and DevOps drivers’ license, a tech maturity scoring approach applied per product team. You need to learn what to do with all these amazing new buttons, data, switches, capabilities, and pricing models. The data, team, and platform for feeding optimization suggestions back to the F1 driver (the developers) are all hooked up already if they follow the onboarding toolkit procedures and start driving. It’s Ready, Fire, Aim as they drive value over time.
The business side of software-centric enterprise DevOps shows that we need to understand current costs against tech measures, work measures, and finance measures at a per-workload level, then roll them up to establish baselines. This new level of visibility and accountability should prompt the next step: a usability audit and business justification for each virtual machine (VM) being on in the first place. For each of the VMs that remain, ongoing scenario modeling of “should cost” on the three measures (Cost per GB RAM Hour, Cost per Unit of Work per Day, Cost per $1k Revenue serviced/day) shows the Dev, Ops, and Tech Biz Management teams which of the already connected, viable compute pools to run that workload on to achieve the best fit and value tradeoff. Only when you can triangulate the three measures of value monthly, on a per-workload or per-product level, will the reporting data and managing to the measures guide you to the land of better TechOps cloud spend value.
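To make the three measures concrete, here is a minimal Python sketch of how they might be computed for a single workload. The `Workload` fields, the function name, and the sample numbers are all assumptions made for illustration; they are not taken from any Ticketmaster tooling, and your own definitions of cost, work, and revenue serviced may differ.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """One VM-backed workload. Every field here is a hypothetical example."""
    name: str
    monthly_cost: float        # total compute cost for the month, in dollars
    gb_ram: float              # provisioned RAM, in GB
    hours_on: float            # hours the VM was running during the month
    days_in_month: int         # days the workload was serving traffic
    work_units_per_day: float  # e.g. page views or transactions per day
    revenue_per_day: float     # revenue serviced per day, in dollars


def three_measures(w: Workload) -> dict:
    """Return the tech, work, and finance measures for one workload."""
    daily_cost = w.monthly_cost / w.days_in_month
    return {
        # Tech measure: what one GB of RAM costs for one hour.
        "cost_per_gb_ram_hour": w.monthly_cost / (w.gb_ram * w.hours_on),
        # Work measure: what one unit of business work costs.
        "cost_per_unit_of_work": daily_cost / w.work_units_per_day,
        # Finance measure: compute cost per $1k of revenue serviced per day.
        "cost_per_1k_revenue": daily_cost / (w.revenue_per_day / 1000.0),
    }


# Hypothetical workload: a 16 GB VM that ran all month for $720.
checkout = Workload(
    name="checkout",
    monthly_cost=720.0,
    gb_ram=16.0,
    hours_on=720.0,
    days_in_month=30,
    work_units_per_day=120_000.0,
    revenue_per_day=8_000.0,
)
measures = three_measures(checkout)
```

With these made-up inputs the workload comes out at $0.0625 per GB RAM hour, $0.0002 per unit of work, and $3.00 per $1k of revenue serviced per day. Computing the same dictionary for every workload is what makes the roll-up and the monthly baseline comparison possible.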
As a last analogy, imagine you run a bakery. Your cost per pound of commodity sugar (Cost per GB RAM Hour) is currently $6 a pound, and you are using 10 pounds a day for that product. What would you re-prioritize on your TechOps and DevOps agenda next week if you found out that you are paying $4 a pound too much for all your sugar, and that you only need 1 pound a day for that product? That is information econometrics. You are essentially paying for and using your commodity wrong. This is where the 35-50+ percent waste in cloud spending that the CIO article mentions stems from, and I agree. What conversations would start in your organization if you knew you had a couple of teams that hit 90 cents per pound for their sugar, used only half a pound per day, and increased their product profitability and growth profit ratios massively, while Finance, Operations, AppEng, and TechOps audit the savings and measures from the same data?
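The arithmetic behind the sugar analogy can be worked through in a few lines of Python. The prices and quantities are the ones in the analogy above; the split into a "rate" overpayment and a "volume" overuse is an added illustration, and the function and variable names are hypothetical.

```python
def daily_spend(price_per_pound: float, pounds_per_day: float) -> float:
    """Daily spend on one commodity: unit rate times quantity used."""
    return price_per_pound * pounds_per_day


# Figures from the bakery analogy.
current_price, current_pounds = 6.00, 10.0
should_price = current_price - 4.00   # paying $4 a pound too much
should_pounds = 1.0                   # only 1 pound a day is needed

current = daily_spend(current_price, current_pounds)       # $60.00 a day
should_cost = daily_spend(should_price, should_pounds)     # $2.00 a day
best_in_class = daily_spend(0.90, 0.5)                     # $0.45 a day

# Split the gap into paying too much (rate) and using too much (volume).
rate_overpayment = (current_price - should_price) * current_pounds   # $40.00
volume_overuse = (current_pounds - should_pounds) * should_price     # $18.00

# Share of the current daily spend that is avoidable in this analogy.
waste_fraction = (current - should_cost) / current
```

In this deliberately extreme example, $58 of the $60 daily spend is avoidable, with $40 coming from the unit rate and $18 from the quantity used. Real workloads will rarely be this lopsided, but the same two-part split shows whether a team should fix its pricing model, its sizing, or both.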
Once you have done the per-workload scatter plot, with cloud readiness on the Y axis and cloud value on the X axis, you need to optimize both colo and cloud workloads in parallel, and start immediately. Waiting has a price, and that’s called opportunity cost. If you plan for a 3-month colo-to-cloud migration and it takes 6-12 months, you also suffer negative repercussions from not completing the migration quickly enough. The other key metric here is to divide the compute cost per VM per month by the business measure of work. “Work” is your activity of stuff done, like page views per day, transactions per minute, or jobs per day, and the result is your business unit rate cost of work. Once you start to tie your optimization suggestions back to the unit rate sugar costs and business work measures, your developers become a financially aware DevOps org that designs to value and user experience. You will have a strategy, measures, and a process that guide everyday decisions, plus an audit function that management will love. You have to watch both cost per GB RAM hour and cost per unit of work to ensure you are doing better each month.
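One way to watch both measures month over month is sketched below in Python. It reads the unit rate cost of work as compute cost divided by work done, which is an interpretation of the metric described above. The record layout, function names, and monthly figures are hypothetical.

```python
def unit_rate_measures(vm_cost: float, gb_ram_hours: float, work_done: float) -> dict:
    """Compute both watched measures for one VM for one month."""
    return {
        "cost_per_gb_ram_hour": vm_cost / gb_ram_hours,
        "cost_per_unit_of_work": vm_cost / work_done,
    }


def regressions(history: list) -> list:
    """Return (month, measure) pairs where a measure got worse than the prior month."""
    flagged = []
    for (_, prev), (month, curr) in zip(history, history[1:]):
        for measure, value in curr.items():
            if value > prev[measure]:
                flagged.append((month, measure))
    return flagged


# Hypothetical history: (month, VM cost in $, GB RAM hours, units of work done).
raw = [
    ("Jan", 900.0, 11_520.0, 3_000_000.0),
    ("Feb", 720.0, 11_520.0, 3_200_000.0),
    ("Mar", 720.0, 11_520.0, 2_400_000.0),
]
history = [(month, unit_rate_measures(cost, gbh, work)) for month, cost, gbh, work in raw]
flagged = regressions(history)
```

In this made-up history, February improves on both measures. March holds the same cost per GB RAM hour but does less work for the same spend, so only its cost per unit of work is flagged. That is the reason to watch the two measures together: a stable sugar price can hide a workload that is delivering less for the money.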
The products and teams that are in public cloud today, whether cloud native from launch or net new services designed with emergent tech like serverless and step functions, are leveraging the huge service catalog and pricing options from public clouds to service revenue. That abundance of services, middleware, price models, and core compute gives them many more tactics and levers to activate than the limited colo product and pricing catalog that I can optimize against.
AWS had over 1000 new products come out this last year. There are just so many more buttons, switches, dials, and pricing models to activate in public cloud, especially as you take on should-cost modeling responsibility for entering emerging markets, rapid global expansion with DR, and massive daily demand flexing from known and unpredictable traffic spikes. An optimization tactic that cuts your sugar costs by 40 percent can be rolled out in a few minutes in public cloud. The same tactic may take 3+ years to roll out in colo due to HW and procurement refresh patterns, and that gap is opportunity cost. The companies and teams I have met with over the years that are best in class, in highly competitive software sectors, are the ones that took the time to upskill or hire for the advanced and expert level skills needed to drive the F1 well. They are financially aware (they know which piece of code triggers X compute costs for Y work per day), they design to value, and they have separate management folks watching and strategizing to master the technology business management outcomes continuously.