Monitorama 2022 — What an amazing experience!

Matthew Macdonald Wallace
5 min read · Jul 1, 2022


Monitorama is a conference that focuses on Observability, Monitoring, Metrics, Logs, Traces, Dashboards, Incident Management, and just about everything else in the area.

An annual event (apart from during the COVID Pandemic), the conference is a fantastic blend of industry experts, beginners, and vendors sharing their stories either on stage, in the corridors of the conference space, or at the various restaurants, bars, and parks in the surrounding areas.

This was my first time visiting Monitorama, so here (in no particular order!) are the topics of conversation, conference talks, and vendors that caught my attention over the past few days.

The main topic that grabbed my attention was “How do we help our teams/clients write good SLOs and SLIs?”

This was inspired by Sophia Russell’s talk “The Little SLI That Could”, in which Sophia described how her team had started decorating their Ruby classes with attributes that could then be turned into SLIs via some custom scripts and other tooling as part of the CI pipeline. She demonstrated her point using toy trains and the story of “The Little Engine That Could”. The topic continued throughout the week, with various people talking about how similar approaches could provide a wider framework for all kinds of languages and organisations.

A still from a stop-motion video of some toy trains based on the book “The Little Engine That Could”
Sophia’s homemade stop-motion video was a particular highlight, explaining the approach she had taken with her team in a fun, easy-to-understand way
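To make the idea a little more concrete, here is a minimal sketch of the general pattern: annotate a class with SLI metadata, then let a script export it. This is not Sophia's implementation (her team used Ruby and their own tooling, which I haven't seen in detail). It is a Python analogue, and every name in it (`sli`, `SLI_REGISTRY`, `CheckoutHandler`, `export_slis`, the field names) is hypothetical.

```python
import json
from typing import Dict, List, Optional

# Hypothetical in-memory registry; a real pipeline might write to a file instead.
SLI_REGISTRY: List[Dict[str, object]] = []


def sli(name: str, kind: str, threshold_ms: Optional[int] = None):
    """Class decorator that records SLI metadata (all field names are illustrative)."""
    def decorator(cls):
        SLI_REGISTRY.append({
            "name": name,
            "kind": kind,
            "threshold_ms": threshold_ms,
            "target": cls.__qualname__,
        })
        return cls  # the class itself is left unchanged
    return decorator


@sli(name="checkout-latency", kind="latency", threshold_ms=300)
class CheckoutHandler:
    def handle(self) -> str:
        return "ok"


def export_slis() -> str:
    """Serialise the registered SLIs, e.g. for a CI step to pick up."""
    return json.dumps(SLI_REGISTRY, indent=2)


if __name__ == "__main__":
    print(export_slis())
```

The appeal is that the developer only writes the one decorator line next to the code it describes, and the translation into whatever the monitoring stack needs happens elsewhere.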

Obviously, I wasn’t going to miss an opportunity to share more information about the OpenSLO and SLODLC projects that I’m involved in. However, the relative simplicity with which Sophia and her team had enabled developers to create and deploy SLIs was generally agreed to be really quite impressive, and more accessible than having to write multiple lines of YAML.

A few of us even started wondering if we could write a Gherkin syntax compiler to describe SLIs as user stories, so stay tuned to see if that actually happens!

Other conversations ranged from “Do we really need to do EVERYTHING in order to do SRE?” (you don’t) and “Where shall we socialise tonight?” (Momo’s Bar) to “That’s a really cute dog, I really miss my dog…” (it was cute, and a disproportionate number of people appear to own dogs at Monitorama). However, it was a chance encounter with Steve McGhee from Google that led me to find out about the embryonic r9y.dev project.

R9y (which stands for “Reliability” just as “o11y” stands for Observability) is a project started by Steve to try and help organisations work out how they can get to the appropriate “number of nines”.

A screenshot of the “tech tree” approach taken by r9y.dev to assist organisations in reaching their desired level of reliability maturity
The r9y.dev Tech Tree

The concept is based on a “tech tree” such as you would find in Sid Meier’s Civilization or Kerbal Space Program, and requires you to achieve competency in each of the preceding areas before you can move on to the next and achieve your goal.

This is, of course, after you’ve helped the organisation you’re introducing this to understand that “five nines” (99.999% uptime) is probably not what they actually need, and they can usually get away with 99.95%.
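If it helps when having that conversation, the arithmetic behind the “number of nines” is simple enough to sketch. The function name and the 30-day window below are my own assumptions for illustration, not anything taken from r9y.dev.

```python
def allowed_downtime_minutes(availability_percent: float, period_days: float = 30) -> float:
    """Minutes of downtime permitted over the period at the given availability."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Compare a few common targets over an assumed 30-day window.
    for target in (99.9, 99.95, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% -> {minutes:.2f} minutes per 30 days")
```

Over 30 days, 99.95% leaves roughly 21.6 minutes of downtime, whereas “five nines” leaves under half a minute, which is usually where the conversation about what is actually needed gets a lot easier.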

I can’t wait to get more involved in the r9y project and help it progress and become even more useful to our customers!

There were many more conversations about how we can present data to customers, with Fred Moyer showing some really simple ways to break down SLIs and SLOs into their component parts, leaving many in the audience (including me!) scratching their heads and wondering why they hadn’t thought of doing it that way!

A photograph of a slide from Fred Moyer’s talk at Monitorama, in which he separated out the various components of an SLI, an SLO, and an error budget by colour, making them a lot easier to understand.
The colour coding and clear separation of Fred Moyer’s slides were the envy of many!
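I can't reproduce Fred's slides here, but the same separation can be sketched in code using the commonly used definitions: the SLI is the ratio of good events to total events, the SLO is the target for that ratio, and the error budget is what is left over. The class and field names below are hypothetical, and the numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class SloReport:
    good_events: int    # events that met the success criteria
    total_events: int   # all valid events in the window
    slo_target: float   # e.g. 0.9995 for 99.95%

    def __post_init__(self) -> None:
        if self.total_events <= 0:
            raise ValueError("total_events must be positive")

    @property
    def sli(self) -> float:
        """The indicator: proportion of good events."""
        return self.good_events / self.total_events

    @property
    def error_budget_events(self) -> float:
        """How many bad events the SLO allows in this window."""
        return (1 - self.slo_target) * self.total_events

    @property
    def budget_remaining(self) -> float:
        """Fraction of the error budget still unspent (negative if overspent)."""
        bad_events = self.total_events - self.good_events
        return 1 - bad_events / self.error_budget_events


if __name__ == "__main__":
    report = SloReport(good_events=999_700, total_events=1_000_000, slo_target=0.9995)
    print(f"SLI: {report.sli:.4%}")
    print(f"Error budget (events): {report.error_budget_events:.0f}")
    print(f"Budget remaining: {report.budget_remaining:.1%}")
```

With these example numbers, 300 bad events against a budget of about 500 leaves roughly 40% of the budget unspent.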

Corey Quinn also delivered a devastating talk on whether to build or buy, with the answer being “it depends”. For me, however, the main takeaway was that egress traffic is really expensive: every time your applications in AWS send logs or metrics to a cloud-hosted observability provider outside AWS, you start running up charges on your bill.

Couple that with the fact that most SaaS offerings make requests to the CloudWatch API, each of which is chargeable above a certain level, and it’s very easy to find that the largest part of your AWS bill is actually the calls made from outside your applications and infrastructure by your Observability stack. That’s something to keep in mind in future!

A photograph of a slide from a presentation — the slide is split into “build” on one side and “buy” on the other. The content on both sides is the same.
Build vs Buy? It depends…

I also want to say thank you to all the vendors that I spoke to. All of them engaged with me even though it was clear that, as a consultant, I wasn’t their immediate target market, and pretty much everyone I spoke to was keen to come and speak to our engineers and consultants at Contino to help us understand where their product might help our customers.

They also gave me so much swag I’m going to struggle to get it all home again!

A table in a hotel room covered in socks, stickers, food, and books collected from various vendors throughout the conference
Socks, T-Shirts, Stickers, Coffee, Tea, and even flavoured salts for cooking — I’ve no idea how I’ll get it all in my carry-on!

Finally, I want to say a massive thank you to everyone who was involved in organising and running Monitorama this year, especially Jason Dixon and Pete Cheslock, both of whom did a fantastic job of keeping it running smoothly.

The catering team made sure that we had more than enough food and drink each day (I’m pretty sure I’m going back at least 2lb heavier than I arrived as a result!), and the team at Portland Center Stage provided us all with a safe, comfortable space in which we could all enjoy ourselves.

So what’s next? I’ll be exploring r9y.dev more, working on improving OpenSLO and the SLODLC based on all the things we discovered and spoke about at Monitorama, and who knows, I might even update The Observability River as a result! For now, I’m off to find my flight back home to the UK.

Thanks Portland, you’ve been amazing! I’m looking forward to coming back for another Monitorama in the very near future.

Written by Matthew Macdonald Wallace

DevOps, SRE, and IoT consultant specialising in monitoring and observability
