Implementing SLA in Agile model

Hi all, I manage a support team in a company where the Agile model is used. The problem is that loads of tickets remain open because developers don’t have time to fix bugs, and this leaves users frustrated. I would like to implement an SLA for ticket resolution to ensure that incidents and requests are resolved in a timely manner, but from what I understand, SLAs are not really in line with the Agile way. So should we set targets and guidelines for resolution time, or should we implement a formal, binding SLA?

Thanks and Regards,
Sandeep.

Is there a product manager prioritising the work? It’s reasonable to expect the product/dev team to at least triage the bugs within an SLA timeframe, but actually getting them fixed will require the product manager to trade off fixes vs other features.

Well,

  1. This doesn’t seem related to Elixir 🙂

  2. Hopefully, you can have a meeting with the product manager(s) and the entire dev team to discuss your concerns, see how everyone feels about it, and work out what might be done. Many agile teams have regularly scheduled retrospectives. If your team does too, that might be a good place to bring it up.

  3. If it were my team, I’d have a policy of fixing all outstanding bugs before working on any features. Not only does this make a better product for users, it reduces the cost of repeatedly dealing with the same bug reports, and it reduces the development cost, since developers will inevitably stumble across the bugs while programming, investigate the causes, and then eventually realize that it’s a known bug, leaving the broken code around for the next developer to repeat the process.

This is actually common practice: bug fixes first. If somebody’s telling OP that Agile does not allow this, they are wrong.

I’ve also had the unfortunate experience of working on a very large legacy application with so many bugs that fixing them all would have halted feature dev for years. In those situations I prefer to close low-impact bugs as wontfix, since in all likelihood the bug report would sit at the bottom of a backlog forever.

Yep, sometimes it’s already in the position of “too many bugs” before you get there, precisely because bugs were ignored for too long. You can’t go back in time and change that; you have to compromise to make any progress.

And sometimes even on new development, where you’re taking a firm “we fix bugs” stance, a bug will be too big to tackle right now.

I’ll chime in to add to this point, which is absolutely true when dealing with really bad legacy codebases.

If you don’t have a way to measure what is worth fixing, you’re wasting time on purpose.

It would be a better use of your time to plug in some analytics and metrics and gauge what is actually impacting your customers/stakeholders.

Further to the point made by @mbuhot, I’d go so far as to say that if the support team and the delivery team don’t understand why features are being prioritised above bugs, you have a product management / ownership problem, not an SLA problem. It can be perfectly valid to prioritise features over bug fixes in certain circumstances (e.g. minor annoyance bugs vs a major value-add feature), but everyone involved should really understand the trade-offs being made. The product owner / manager owns those trade-offs and should be communicating them.

As far as SLAs go, it is quite straightforward to implement them in an agile model, but, for reasons I’ll cover below, it only really makes sense for external SLAs with customers. Suppose you have 1-week sprints; you could structure your SLAs as follows:

  1. Critical / Blocker defects -> Kill sprint, fix defects ASAP, get back onto normal backlog work next sprint.
  2. Major defects -> ensure SLA allows you up to 2 weeks to resolve. Put defect at top of backlog for next sprint. Worst case it gets reported just after a sprint starts and it takes just under 2 sprints to deliver a fix.
  3. Minor defects -> ensure SLA makes no commitments for fixes. Support provides workaround or “hug” to the customer, defect gets prioritised along with any other feature or marked as WONTFIX. These usually fall off the list unless enough customers complain about them (or you are trying to upsell to the customer and you need to be particularly nice to them).
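
To make the timing behind these tiers concrete, here is a rough Python sketch. It assumes the 1-week sprints from the example; the names `Severity`, `HANDLING`, and `major_worst_case_days` are made up for illustration and don't come from any tool. It just encodes the three tiers and works out the worst-case turnaround for a Major defect.

```python
from enum import Enum

SPRINT_DAYS = 7  # assumption: 1-week sprints, as in the example above


class Severity(Enum):
    CRITICAL = "critical/blocker"
    MAJOR = "major"
    MINOR = "minor"


# Hypothetical mapping of severity to the handling described in the list.
HANDLING = {
    Severity.CRITICAL: "kill sprint, fix ASAP, resume backlog next sprint",
    Severity.MAJOR: "top of backlog for next sprint",
    Severity.MINOR: "no fix commitment; workaround, prioritise like a feature, or WONTFIX",
}


def major_worst_case_days(day_reported: int, sprint_days: int = SPRINT_DAYS) -> int:
    """Upper bound on days from report to delivery for a Major defect.

    day_reported is the 0-based day within the current sprint. The defect
    waits out the rest of this sprint, then ships by the end of the next one.
    """
    if not 0 <= day_reported < sprint_days:
        raise ValueError("day_reported must fall inside the sprint")
    return (sprint_days - day_reported) + sprint_days
```

A defect reported on day 0 gives `major_worst_case_days(0) == 14`, which is why the SLA for Major defects needs to allow up to 2 weeks when sprints are 1 week long.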

It’s good practice, in my view, to allow around 20% of the delivery team capacity for triaging / workaround / emergency fixes so you may not have to kill a sprint to handle a critical / blocker issue and you can handle “business as usual” escalations from support. If it isn’t needed you just pull the next feature off the top of the backlog into the sprint.
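
As a quick illustration of that capacity split, here is a minimal Python sketch. The function names and the idea of measuring capacity in points are assumptions for the example; the 20% figure is just the rule of thumb above, not a fixed rule.

```python
from typing import Tuple


def split_capacity(total_points: int, reserve_percent: int = 20) -> Tuple[int, int]:
    """Return (planned_feature_points, support_reserve_points)."""
    if not 0 <= reserve_percent <= 100:
        raise ValueError("reserve_percent must be between 0 and 100")
    # Integer arithmetic keeps the split exact; the reserve rounds down.
    reserve = total_points * reserve_percent // 100
    return total_points - reserve, reserve


def points_to_pull(reserve: int, spent_on_support: int) -> int:
    """Unused reserve that can go to the next item(s) on the backlog."""
    return max(reserve - spent_on_support, 0)
```

For a team with 40 points of capacity, `split_capacity(40)` plans 32 points of backlog work and holds 8 back. If support escalations only used 3 of those, `points_to_pull(8, 3)` says there are 5 points free for the next feature.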

Back to SLAs between support and development… Support and development are part of the overall system of converting capital and manpower into value. Other parts of the system include strategy, sales, account managers, marketing, finance etc. To do “agile” nicely you need all these functions talking together to prioritise work correctly.

Is the company strategy to become regarded as the best quality provider in the industry, or is the strategy to ship features fast in a new market land grab? Are account managers losing renewals because of defects? Are sales losing deals because of defects? Or are there just a handful of cranky users across thousands of customers that make a lot of unwarranted noise? Are sales pursuing an upsell opportunity in an unhappy customer where a little bit of love would go a long way? Do marketing have a big “feature-release” event planned and paid for, resulting in a hard deadline for a new feature? Are the cranky users in customers that are aligned with the overall business strategy, or “legacy” customers? Does dealing with defects result in extra resource requirements in support? These are whole-of-business questions. Once you have answers to these, it becomes pretty straightforward to prioritise.

You are honestly better off getting clarity across the business on what’s important rather than imposing SLAs on your colleagues.

EDIT: Finished last sentence after NBN drop-out lost it (Aussies will understand…)
