Writing user stories to drive observability usage

We’ve all worked on projects where, despite best intentions, monitoring of the platform or application has been shoe-horned in the week before go-live and, as a result, ends up monitoring the wrong aspects of the platform. So how can we introduce stories to our backlog at the start of application development to make monitoring and observability one of the foundation stones of every project we develop?

Matthew Macdonald Wallace
Contino Engineering
4 min read · Sep 2, 2021

--

In the past, I’ve spoken about the Observability River and the journey that we all need to take in order to get from “is the platform up?” through to “how are my users interacting with the application?”. Today I’m going to build on that to look at how we can integrate the various stages of the Observability River into our application.

Often as engineers, we (rightly!) focus our development stories on the user interaction. Stories such as “As a user, I want to be able to add an item to my basket, so I can buy it” are not uncommon, but what about the internal requirements for our team and those around us within our own organisation that fall outside the “end-user” customer?

When we talk about monitoring and all the goodness that goes with it, we frequently focus (again, rightly!) on what is important to the development teams and those who are going to support the application. However, it’s time that we started to move beyond that small circle of individuals and out to the wider organisation.

How would your organisation’s approach to monitoring change if you started to add the following stories to the backlog?

As a developer
I want to see how my application is performing
So I can refactor the appropriate parts of the codebase to improve user experience

This is a story about a process we’re all familiar with: we check the data, then we refactor and see if the data improves. But when was the last time you wrote it down as a user story on the backlog?
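If it helps to make this story concrete, here is a minimal Python sketch of the "check the data" half of that loop. It assumes a hypothetical `timed` decorator and an in-memory `DURATIONS` store, neither of which comes from any particular monitoring product; in a real project the timings would be shipped to whatever metrics backend the team already uses. The `add_to_basket` function simply echoes the basket example from earlier.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-memory store, used here only for illustration.
# A real application would send these timings to its metrics backend.
DURATIONS = defaultdict(list)


def timed(name):
    """Record how long each call to the wrapped function takes, in seconds."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even if the call raises, so failures are timed too.
                DURATIONS[name].append(time.perf_counter() - start)
        return wrapper
    return decorator


@timed("add_to_basket")
def add_to_basket(basket, item):
    """Illustrative stand-in for the code path we want to measure."""
    basket.append(item)
    return basket


basket = add_to_basket([], "book")
```

Comparing the recorded durations before and after a refactor gives the developer the evidence the story asks for.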

As a product owner
I want to know how my application is performing
So I can prioritise the backlog more effectively

Similar to the above, application performance data can help product owners, tech-leads, scrum facilitators, and agile coaches move stories around the backlog to prioritise performance enhancements alongside new features. If your agile “leadership” team doesn’t have access to your monitoring metrics, perhaps they should.

As a marketing designer
I want to understand page load times
So I can ensure our content is optimised appropriately

OK, so perhaps not every marketing exec is going to be interested in page load times. However, they may want to understand how different sizes and formats of images affect how quickly prospective leads can read about your product.

Poorly optimised images and uncompressed assets are well-known causes of long page load times. With access to the monitoring dashboards, your marketing teams can start to look at the way they generate media for the site and find the levels of compression and quality loss that are acceptable.

As a sales executive
I want to ensure the website is serving the correct content
So my targeted mailings don’t go unanswered

Let’s imagine that we’re about to send out a sales email to 2,000 existing customers inviting them to sign up for a new service based on their current preferences. That email has a link in it to an asset on the website and, using a marketing automation tool, we’re going to track how many customers click that link and who they are.

As part of our pre-send checks we’ll want to confirm that the asset is in place, but what happens if it disappears for whatever reason halfway through sending those emails, or 24 hours after the emails are sent out?

Providing the sales team with observability and insight into their assets allows them to halt the mailshot, follow up directly with those who clicked the link but didn’t get the asset, or even look at whether people preferred one format of the document over another.
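As a rough illustration of the kind of check behind this story, here is a small Python sketch using only the standard library. The function names `fetch_status` and `should_halt_mailshot`, and the threshold of three consecutive failures, are assumptions made for the example rather than part of any marketing automation tool. The idea is to poll the asset's URL while the mailshot is going out and raise a flag once several checks in a row have failed.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for a HEAD request, or 0 if unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404.
        return error.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout and similar problems.
        return 0


def should_halt_mailshot(statuses, max_failures=3):
    """Return True once the last `max_failures` checks have all failed.

    `statuses` is the history of status codes, oldest first. Requiring
    several consecutive failures (an assumed threshold) avoids halting
    the mailshot because of a single blip.
    """
    recent = statuses[-max_failures:]
    return len(recent) == max_failures and all(
        not 200 <= status < 300 for status in recent
    )
```

A scheduled job could append `fetch_status(asset_url)` to a list every minute and alert the sales team as soon as `should_halt_mailshot` returns `True`; `asset_url` here would be whatever link the email actually contains.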

With the exception of the first story in the list above, all of these stories are aimed at improving observability for the wider organisation, not just the development teams. Taken together, they improve the platform, the application, and the user experience.

What other user stories would you add to improve observability across your organisation? Let us know in the comments below!

Matthew Macdonald Wallace
Contino Engineering

DevOps, SRE, and IoT consultant specialising in monitoring and observability