
A systematic view on Developer Productivity and Transformation progress

How do we measure the productivity of developers? This is a riddle that I have been working on for a while (see this post). And how do we measure transformation success in the same context?

If we had good answers to these two questions, things would become a lot easier for us in the DevOps community. At the DevOps Enterprise Summit 2022 I had many great conversations in the hallway track to get different perspectives (special mention to Hans Dockter and Bill Bensing, among many others, for entertaining a good discussion), which allowed me to get some clarity in my head. I will share where I got to in this post, which might be a bit on the longer side.

Measuring Developer Productivity = Transaction Cost

Developer productivity is extremely difficult to measure; lines of code, story points, etc. all fail to really measure it. But what many people can align on is that you can measure toil – the work that does not add value in the SDLC. One good measure for toil is the transaction cost of code (see more on that here) – one proxy way to measure this is the time between the code commit and the deployment and validation in production. So if we speed up this feedback cycle, we can use that as a first-order proxy for the transaction cost. Let's use this proxy for the rest of this article.
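
To make this proxy concrete, here is a minimal sketch in Python of how it could be computed from delivery telemetry. The event records, field names, and numbers are purely hypothetical; the point is simply that the measurement is the elapsed time from commit to validated production deployment, summarised per team or application.

```python
from datetime import datetime
from statistics import median

# Hypothetical delivery telemetry: one record per change, with the commit
# timestamp and the timestamp when the change was deployed and validated in
# production. The field names are illustrative, not from any specific tool.
changes = [
    {"id": "abc123", "committed_at": "2022-10-01T09:15:00", "validated_in_prod_at": "2022-10-03T14:30:00"},
    {"id": "def456", "committed_at": "2022-10-02T11:00:00", "validated_in_prod_at": "2022-10-02T16:45:00"},
]

def transaction_cost_hours(change: dict) -> float:
    """First-order proxy: hours from code commit to validated deployment in production."""
    committed = datetime.fromisoformat(change["committed_at"])
    validated = datetime.fromisoformat(change["validated_in_prod_at"])
    return (validated - committed).total_seconds() / 3600

costs = [transaction_cost_hours(c) for c in changes]
print(f"Median transaction cost: {median(costs):.1f} hours")
```

Whether you collect this from pipeline events or from a value stream mapping exercise matters less than tracking the trend per team over time.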

The goal of Transformation is to achieve optimal Transaction cost

One major goal of any DevOps transformation is to improve developer productivity and remove toil. This will speed up the delivery cycle and as a result, likely reduce cost. If we keep or improve quality standards along the way we are scoring some major goals for the organisation.

The optimal transaction cost is a theoretical construct – it represents the minimum possible time and effort it could take to deploy and validate code in production. All the manual steps like manual testing, approvals, deployments, and infrastructure changes increase the cost above the minimum transaction cost. Automated but not optimised steps like unnecessary scans or build processes also increase the transaction cost. You can assume that we will never achieve the optimal transaction cost, as the incremental cost to improve transaction costs further becomes uneconomic at some stage. Transaction cost improvements as part of transformations usually follow an asymptotic curve – or the law of diminishing returns.
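
One way to picture this asymptotic behaviour (a purely illustrative model, not a claim about any specific transformation) is an exponential decay towards the optimal cost:

```latex
% Illustrative only: C_opt is the optimal transaction cost, C_0 the starting
% cost, and k a team-specific rate of improvement over iteration t.
C(t) = C_{\mathrm{opt}} + \left(C_0 - C_{\mathrm{opt}}\right) e^{-kt}
```

Each additional iteration closes only a fraction of the remaining gap, which is why chasing the last few percent eventually stops being economic.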

Optimal Transaction Cost is a function of architecture and process/governance

So what determines the optimal transaction cost for an organisation? Well, first of all, the optimal transaction cost is not determined organisation-wide. The technologies in use and the architecture are factors that differ across teams in an organisation. It's easy enough to see: a packaged software application likely compiles in minutes, a custom app in seconds, and serverless functions not at all. Clearly, this will make a difference in the transaction cost that we can achieve. But it is not technology and architecture alone that determine the optimal transaction cost per application. The processes and governance in place across the organisation play the other major role in determining the optimal transaction cost you can achieve as a team.

You might notice that I consider architecture and governance/process as static in the calculation of optimal transaction cost. I do this because I believe transformations to be first and foremost something that teams drive – if you saw my talk at DOES 2022, you know I speak about the "choose your own adventure" DevOps transformations of teams that you need to guide if you want to transform whole organisations. I will talk about how to elevate the game across the organisation in the next section. But before I do that, let's look at what team-driven transformations look like.

This graph shows two applications: one a larger packaged software application, the other a smaller custom application. The optimal transaction costs are different, as mentioned.

Here you can see how each team is making progress. It's not linear, and it's not always getting better, but overall the teams are getting closer to the optimal transaction cost. As I said before, they are not expected to ever achieve it, and there are other things they are improving along the way too (like quality standards or the effectiveness of business experiments).

Transformation progress is achieved through 3 attack angles

Now that we have done the groundwork, let’s talk about the business of transformation. With all this in mind, we can better explain what we are trying to achieve and how to measure it. Let’s assume we found a way to measure transaction costs by either implementing some kind of delivery telemetry or by using a less tool-focused approach like value stream mapping. We then have three different “attack angles” for our transformation:

  1. Achieve optimal transaction cost through automation and streamlining
  2. Improve optimal transaction cost by changing the process/governance across the portfolio
  3. Improve optimal transaction cost by changing the architecture

Let’s go through them in more detail.

Achieve optimal transaction cost through automation and streamlining

This is the team-level transformation where much of the transformation energy needs to be. Teams need to look at their context (perhaps create a value stream map) and identify two things: the areas they can improve themselves and the higher-order constraints that need to be solved outside their team. And then they need to get on with the first while making sure that the second is appropriately prioritised at the organisational level. At this level, we will increase test automation, build out infrastructure automation, and improve the deployment process, among other things. This is very much what I showed in the earlier graph.

Improve optimal transaction cost by changing the process/governance across the portfolio

Even if everything is fully automated, there is still a governance process to go through – often "hoops" and "red tape" can materially add to the transaction cost and need to be reviewed to see what still adds value and what is "toil". Given process and governance are usually company-wide, they cannot be optimised for each application. This means that even if there is a multi-speed model (and there should be if you followed the logic of this post), each speed will still have to cater to the slowest mover in the category. As applications uplift their capability, the process and governance need to evolve at the same time. But changes to the governance process can improve the optimal transaction cost for many applications, making this a very powerful but difficult change.

Improve optimal transaction cost by changing the architecture

The optimal transaction cost for a specific technology is determined by the level of automation and the achievable speed – as mentioned, this is different for, say, a small custom app vs a large packaged application. The level of dependencies in the architecture is another determining factor. So how can we push ourselves to the next level? By changing the architecture – either through decoupling or by transitioning to a different technology. Once you change to a new transaction cost level, there might be new things to do regarding automation and streamlining processes, and new options regarding process and governance might become possible. From here on, progress is made in further iterations.

Interactions between 2 and 3 – it’s an iterative process

I think it is obvious that these three attack angles happen in all permutations, in parallel or sequentially, and once you have made improvements along one of those angles, new opportunities arise along another. So you need to keep working on your transformation at both the team and organisational level and use a transformation "compass" like a balanced scorecard (see more below and here) or other guidance to keep pushing the transformation forward. Just keep in mind that you need to have an economic framework to guide you – at some stage before you reach the ideal transaction cost you will likely encounter a plateau from which there is no economically sensible next step. Don't worry – this will change again in no time as the environment around you changes and new opportunities and challenges arise. Personally, I have encountered such plateaus at the team level but never at the organisational level; there is always something beneficial to improve.

When I talk about transformation I usually use a balanced scorecard approach and clearly the perspective described in this post predominantly sits in the development efficiency quadrant. It is a very important quadrant but not the only dimension you want to cover in your transformation. I will explore the other quadrants in due time in other blog posts.  

By the way, for a fantastic story about improving the top left quadrant, check out Fernando's Adidas story at DOES 2022.

Testing is Risk Management – not Insurance

In the past, we treated testing or quality assurance in IT delivery as something akin to insurance. In traditional waterfall delivery we test as much scope as possible before releasing into production so that the testing team can "assure" quality. If something goes wrong, the fault lies with the assurance that failed. From a delivery team perspective, we put the fault at the feet of the testing team; after all, that is why we paid for their service. If another vendor is involved for testing, we could even blame the cost of poor quality on that vendor, as "clearly" they should have found the problem during testing. So the cost of testing could be seen as the insurance premium paid by the delivery team, so that when something goes wrong we can call on that insurance and put the cost at the feet of the testers.

The world has materially changed and IT delivery works with different paradigms now. I had some fascinating conversations over the last few months that showed me how testing and quality engineering are following a different paradigm now. It is economic risk management. Let me explain what I have learned.

Let's start with the scenario. Assume we can identify the whole scope of an application: all the features, all the logical paths through the application, and all the customer/product/process permutations. That gives us 100% of the theoretically testable scope. If you accept 100% risk, you could go live without any testing…well you wouldn't, but theoretically that would just be maximum risk acceptance. So how do you define the right level of testing then?

Here is the thought process: 

Let's assume we can order all the testable scope from most important to least important (by business criticality, customer value, or whatever else is the right criterion for your company). You can then define two critical points on that continuum:

  1. The point at which you have done enough testing that the risk of go-live is acceptable
  2. The point at which the cost of further testing is higher than the value associated with the remaining tests -> i.e. the point at which we should stop testing

You might notice that the two points are not the same, which means testing does not have to stop at Go-Live. The idea of continuing testing after the release is not new. Some companies have always used production for some testing that wasn’t possible in non-production environments.

What we are saying here is that you might decide to go live due to the business benefits, but that you will continue testing in parallel with stakeholders using the application in production to flush out lower-risk problems. The idea is that if a user beats us to finding a problem, the impact is acceptable. Less frequently used or lower-value features might qualify for post-go-live testing; you wouldn't want to hold back high-value features from going live if an untested feature is unlikely to be used in production before we get to test it.
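
To make this concrete, here is a minimal sketch in Python of how the two points could be derived once the testable scope is ranked. The items, values, costs, and the 80% risk-appetite threshold are entirely made up for illustration.

```python
# Entirely illustrative numbers: "value" is the risk-weighted business value
# protected by testing an item, "cost" the effort to test it.
testable_scope = [
    {"item": "checkout flow",      "value": 100, "cost": 5},
    {"item": "payment edge cases", "value": 60,  "cost": 8},
    {"item": "profile settings",   "value": 20,  "cost": 4},
    {"item": "rarely used report", "value": 5,   "cost": 6},
]

RISK_APPETITE = 0.80  # assumed threshold: go live once 80% of the at-risk value is covered

# Rank the scope from most important to least important.
ranked = sorted(testable_scope, key=lambda t: t["value"], reverse=True)
total_value = sum(t["value"] for t in ranked)

covered = 0.0
go_live_index = None  # point 1: enough testing done to accept the go-live risk
stop_index = None     # point 2: further testing costs more than the value it protects

for i, test in enumerate(ranked):
    covered += test["value"]
    if go_live_index is None and covered / total_value >= RISK_APPETITE:
        go_live_index = i
    if stop_index is None and test["value"] < test["cost"]:
        stop_index = i  # first test whose cost exceeds the value it protects

if go_live_index is None:
    go_live_index = len(ranked) - 1  # threshold never reached: test everything before go-live
if stop_index is None:
    stop_index = len(ranked)         # every remaining test still pays for itself

print("Test before go-live:           ", [t["item"] for t in ranked[: go_live_index + 1]])
print("Continue testing in production:", [t["item"] for t in ranked[go_live_index + 1 : stop_index]])
print("Not worth testing:             ", [t["item"] for t in ranked[stop_index:]])
```

In practice the "value" would come from the risk and business-criticality assessment described above, and the threshold would be a conscious business decision about acceptable go-live risk.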

There are two thoughts here that would have been heresy when I started testing nearly 20 years ago:

  • You can plan to go live before you complete testing
  • Testing continues after go-live to find defects faster than the user does

In Agile training, we speak about the need to prioritise defect work against feature work: making the conscious choice to let a defect exist longer if there are higher-value features to be delivered. The way I describe testing in this blog follows the same thought process. We want to maximise value and economically evaluate our options.

I am excited by this economic testing framework and encouraged by some of the conversations I have been able to have in support of this idea. People have inherently used similar ideas when they ran behind on testing or when fix-forward decisions were made. Formalising this thought process and consciously planning with it will create additional value for organisations.

The implications are clear to me: testing has moved from being a phase to being an ongoing endeavor. Given the limited capacity, prioritisation calls need to be made continually on where to invest that capacity: in the pre-release phase or in the post-release phase, on further testing existing features or on testing newly created ones, on more testing of high-value features or on a first round of testing of lower-value features. An economic model based on risk should drive this, not the need to achieve a certain test coverage across the board. Which testing strategy we follow should be a conscious business decision, not an IT delivery decision.

I am sure you will hear more about this modern take on testing and risk management in the future.

DOES 22 – Baby, it’s back – A Summary

And it’s over!

The first post-pandemic DevOps Enterprise Summit. I am sorry I doubted you, DOES; thinking now that I nearly gave it a pass feels crazy given how much positive energy, great ideas, and inspiring stories I was able to take away from the conference.

So here come my reflections on what I saw this year:

Themes:

  • Unavoidably, Covid, the economic outlook, and the situation in Ukraine were a constant backdrop to the stories, which showed how organisations weathered the storm
  • The hallway track and exhibition hall were full of discussions about Developer Experience, Developer platforms, and Developer productivity (and some consensus that you cannot measure productivity easily, but you can measure the reduction of toil as a proxy)
  • On stage the themes were:
    1. Bringing business and IT together
    2. Automated governance and making audits less painful
    3. The support of expert teams, platform teams, (and whatever else you call it) that can help your DevOps journey

Top 3 talks:

Transformed! A Musical DevOps Journey with Forrest Brazeal

What a performance – I cannot wait to share the video once available. The songs were funny and heartfelt, the appeal in the middle was close to my heart, and he held the whole audience captive for his session. Thanks, Forrest, I hope you go viral!
Check it out here.

adidas: How Software Engineering Boosts the Next Adidas Commerce Transformation

A smashing video that showed the Tech behind Adidas, a presentation showing how Adidas innovated at a fast pace and the incredible scale Fernando deals with. A couple of interesting takeaways:

  • Each of his technology hubs owns a technical capability, which made each hub a priority in its own right and reduced the focus back on HQ
  • The incredible economic impact outages can have (1M per hour at peak)
  • What incredible fun it can be to solve problems when you are close to your customers

Simplification and Slowification: Creating Conditions for the Enterprise’s Distributed Intelligence to Achieve Unparalleled Results

A powerful analogy about painting rooms and moving furniture. The ideas of simplification and slowification will need to sit with me for a little while before I fully get my head around them. Some great sound bites – "Winners harvest the intellectual horsepower of their organisation". As a leader you need to shape the time and space of your people, so that they can solve problems when they are not under high pressure and can operate safely later. You can read a bit more about the analogy here.

A quick flyby of some of the other amazing talks:

Do I Need to Insource to Achieve My DevOps Goals?

An appeal to figure out how to work with Global SIs given they make up a large part of the industry. Not very surprising that I felt spoken to. And a great reiteration from Gene upfront that the DevOps survey found it is not outsourcing per se that is a problem for DevOps; it's functional outsourcing that creates barriers.

Airbnb’s Nimble Evolution: Chapter One How Airbnb is Evolving Its Ways of Working

A nice view of the early steps, the first chapter, of Airbnb’s evolution with a key takeaway being the problem of having too much WIP to be successful

2×2: Organizing for DevOps Success in Large Bureaucracy

The 2×2 matrix of unity of command vs proximity to customer outcomes will become a staple in my client discussions. So simple, so powerful

Dear Security, Compliance, and Auditors, We’re Sorry. Love, DevOps.

A really interesting view under the hood of automated governance. Lots of interesting things to take away and try out.

Creating the Value Flywheel Effect

 Lots to digest in this one about cloud migrations and serverless – I also liked the term “legacy cloud” for apps migrated but not transformed. 

Automated Governance Changed What We Believed In

This one hit close to home – the story of automated governance revealing that a team didn't follow a corporate standard, which could have been cause for dismissal, gave me a perspective that I hadn't had before. Really worth considering when speaking about psychological safety.

Operational Excellence in April Fools’ Pranks: Being Funny Is Serious Work!

A fun presentation by Tom that ended up also being insightful about the need for and the power of good operational controls and feature flags. A talk worth a rewatch.

Human Complications (and Complicated Humans) in Digital Transformation

Very insightful talk about success in digital transformations. I especially liked the idea that products are the software-tised versions of the business. Honesty and alignment as core tenets for success resonated with me as well.

CSG: CSG’s Ongoing Transformation Story

The quote “Leadership looks a lot like loving people” was such an interesting aspect that I am still thinking about how appropriate or inappropriate the analogy is. 

Disney: Disney Global SRE – Creating Digital Magic

Another Jason Cox masterpiece with plenty of Star Wars imagery – I still think Jedi Engineering Training Academy is super cool. How do I get a certification from that academy?!? The talk was powerful as it gave good advice on how a shared service can be successful. In summary: actually help your customers.

Lessons Learned in Engineering Leadership

A humbling story from Michael Winslow who made the journey back from management to individual contributor. A story told with lots of feeling. All the best for your next step Michael.

And then the last talk blew it all out of the water…

Right Fit, Wrong Fit – Why How You Work Matters More Than What You Value by Andre Martin

He spoke about the importance of fit that a person needs to have with the company they work in. People referred to it as a TED talk and that is absolutely correct. I cannot wait for his book to come out next year. The advice to really get to know yourself and your values (not the values you want to have, but the ones you do) and how incredibly impactful the right fit is inspired me. I have to share this talk with many people I know. Thank you Andre.

I could have listed a dozen more here – It was a great event and the whole DevOps community was beaming just for being back together. And the talks gave us all things to take back to our organisations and we made new friends along the way. Looking forward to DevOps Enterprise Summit 2023 already.

Check out the videos of all these talks – you can find the ITRevolution video library here

DevOps Heresy – Breaking Rules Consciously

This week is the big reunion of the DevOps family at the DevOps Enterprise Summit 2022 in Las Vegas. I was lucky enough to be invited back as a speaker, and while preparing my talk about DevOps Heresy – things I do that break the "common" rules of DevOps – I realised I cannot squeeze it all into my allocated time.

In my talk I speak about things that might surprise you about some of my teams. I would have called these DevOps Heresy myself a few years ago ( … and I am looking over my shoulder wondering whether the “DevOps police” is turning up any minute)

– The team did not fail pipelines with failed tests (see the sketch after this list)

– The team did deploy code that had failed security scans

– The team chose manual cloud deployments over automation

– The team moved from Full Stack to a federated model
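
Taking the first point as an example, here is a hedged sketch of what "not failing the pipeline on failed tests" can look like in a pipeline step. It assumes pytest as the test runner and illustrates the principle only; it is not the actual setup of my teams. Run the tests, make the result loudly visible, but leave the go/no-go call to a conscious human decision.

```python
import subprocess
import sys

# Illustrative only (assumes pytest as the test runner): run the tests and make
# the outcome visible, but do not fail the pipeline step on test failures alone.
result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
print(result.stdout)

if result.returncode != 0:
    # Surface the failures loudly so the team and product owner see them...
    print("WARNING: test failures detected, review before the next release", file=sys.stderr)
    # ...but consciously do not break the pipeline; the go/no-go call stays a human decision.

sys.exit(0)
```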

I have learned so much from working in large complex enterprises over the last few years that I wanted to share it all in one talk – and of course I realise there is not enough time to cover it all.

Hence I feel I have no choice but to write a series of blog posts that go deeper into the world of DevOps heresy and why I think context is king.

My talk finishes with the unsolved riddle of how to find the right balance between pragmatic progress and evangelism that pushes the boundary. This is the real trick of any large-scale transformation I find. I will remain a pragmatic evangelist of DevOps. Looking forward to sharing a few steps of my recent journey with the community.

It’s 2022 and I am trying to crawl out from under a rock

Phew – finally I am sitting down to write this post. Nearly 2 years since I wrote my last blog post and I have missed this. It’s my birthday and this is as good a day as any to re-energise my blogging.

Let's be honest, the last 2 years were hard for everyone. I didn't feel like reading or writing about DevOps when my podcast listening list was full of Corona podcasts instead of DevOps and Agile, when I was following the case and vaccination numbers like I normally follow football scores from the Bundesliga, and when the idea of speaking at a conference meant sitting at home in front of a camera. Now I think the tide has turned and I am honestly ready to crawl out from under the Corona rock.

Both in my professional and private life there was lots of interesting stuff happening and I came through the last 2 years with my sanity and health mostly intact (well trying to work from home as two working parents with no childcare meant sanity got a new definition around my house…)

So what to expect from me – well I set myself the goal to post once a month to ease back into it. My topics will include the usual Agile and DevOps topics but given I spent more time with cloud migrations you might see a few of those posts too. And indeed the next post will be my reflections on the state of cloud migrations, which I will hopefully get out next week.

I am looking forward to reengaging with the DevOps and Agile community and to hopefully meeting many of you again in person at a conference, in the hallway track or just in a bar over a scotch. Here is to hoping 2022 is a step back to writing and putting my thinking into this blog format.

And on a sidenote I am writing a utility to optimise my solar system by using the battery in my Tesla M3 in the best way – I started to write up what I am doing in a separate blog. If you follow me on twitter I will share what I am doing once I have put it in a shareable format.

Agile needs a more agile mindset – and the end of the Method Wars

Okay this blog post will be less pragmatic than my usual ones, but I need to get this off my chest. Why do I encounter so many Agilists that are less agile in mind than many of the PMs I know who work on Waterfall projects? Does any of this sound familiar to you and drive you as mad as it does me:

  • “This is not true Agile”
  • “This is not called a User Story it’s a PBI (or vice versa)”
  • “Method X is not Agile, Method Y is much more agile”
  • “Sprints need to be x days long and not any longer”

I have even heard that some methodologies prevent people with their highest trainer certifications from being certified in other methods. I cannot confirm this, but if true it is madness in my view.

This reminds me of all the passion and silliness of this scene from Monty Python's Life of Brian:

Let’s make one thing clear: There is no award for being Agile according to any one Agile method. This level of dogma is completely unnecessary and is taking too much energy in Agile discussions.

What we are trying to do is deliver better solutions faster. And all the methods and tools out there are for us to combine to achieve that. Of course when you are not mature you should follow one of the methods more strictly to get used to the new way of working and then later combine it with other elements (Shu Ha Ri is a common concept we use to explain this). That is what we should focus on. I appreciate that it is often harder to measure outcomes than compliance with a specific method, but it's worth it.

So if you encounter an Agile coach who is dogmatic, only follows one method, and speaks disrespectfully of all others, be careful. He might be able to help you for a few steps of the journey, but you should look for someone more open-minded to help you in the long term.

There are a lot of good talks/articles out there that challenge our "folklore" of software delivery. I find it extremely interesting to read about people who diverge from the "scripture" and do research to prove or disprove what we think we know. A couple of examples:

If you know of more examples, let me know. I love it when concepts get challenged.

Challenging what I know about Agile

I have been working in agile ways for many years now and many things have become a mantra for every new engagement: we have teams of 7 +/- 2 people, doing 2-week sprints, and keeping the teams persistent for the duration of the engagement. Sounds right, doesn't it? But is it?

How do we know those things? Do we have science or do we just believe it based on the Agile books we have read and what people have told us? Over the last few weeks and months I have actively sought out differing opinions and research to challenge my thinking. I want to share some of this with you.

Team size

Agile teams are 5-9 people. Full stop. Right? There is some research from Rally from a while back that shows that smaller teams have higher productivity and throughput than larger teams. Larger teams, however, have higher quality. So far, so uncontroversial. If quality is really important we want slightly larger teams and a higher percentage of quality-focused people in the team. What really made me rethink was an interview I heard with an Agile coach, where he described how having larger teams leads to better predictability of outcomes. He made two arguments for that. One was quality, as discussed above: lower quality leads to rework, and in small teams that can really hurt predictability. The second was the more obvious one: in small teams, sickness, holidays or other events have a much larger impact, and/or people might feel less able to take time off and burn out. So with all this in mind perhaps slightly larger teams are overall better; they might be less productive, but they provide higher quality, are more predictable and are more sustainable. Perhaps those qualities are worth the trade-off?

Persistent teams

Teams should be long-lasting so that they only have to run through the Forming-Storming-Norming-Performing cycle once and then they are good to go. Every change to the team is a disruption that causes a performance dip. So far the common wisdom. I have heard arguments for a change in thinking: the real game changer is dedication to a team. Being 100% assigned to one team at a time is more important than having teams that work together for a long period of time. Rally in their research found that dedicated people have a 2x factor of performance while long-standing teams have only 1.5x. This model also allows for more flexibility with scarce skillsets – people with those skills can dedicate themselves to a new team each sprint. I feel there is something that speaks for this argument, but personally I think there is probably a balance to be found between full persistence and frequent change. At least we don't have to feel bad when the context requires us to change the team setup.

Sprint/Iteration Length

Shorter sprints are better. 2-week sprints are the norm. I have said those sentences many, many times. When I looked at the Rally research, it showed, similar to team size, that shorter sprints are more productive but longer sprints have higher quality. So we need to consider this in designing the length of the sprints. We also need to consider the maturity of technology and our transaction cost to determine the right sprint length (less automation = longer sprints). And then I heard an interview with a German start-up. They run 2-week sprints most of the time, but sometimes introduce longer 4-6 week sprints, and the reason they give is that 2 weeks is too short for real innovation. Dang. Have we not all felt that the tyranny of the 2-week sprint makes it sometimes hard to achieve something bigger, when we forcefully broke up stories to fit into the 2 weeks or struggled to get software releasable within the sprint? I still think that the consistency of 2-week sprints makes many things easier and more plan-able (or perhaps it's my German sense for order 😉). But at least I will think more consciously about sprint length from now on.

There you have it: three things I took for granted have been challenged and I am more open-minded about them now. As always, if you know of more sources where I can learn more, please point me to them. I will keep a much more open mind about these three dimensions of setting up teams and will consider alternative options going forward.

A subjective state of DevOps / Agile perspective for 2019

As I am starting back at work looking forward to an exciting 2020, I had the chance to reflect on what I have seen at my clients in 2019. The state of Agile / DevOps is so different now to a few years back, which is amazing for me to see. I thought I'd share my perspective. Of course this is highly subjective as it is based on the clients I spoke to and worked with. But as I have spent time in Europe, Asia, Africa and Australia and across many industries, I still think this reflection might be of interest to some of you.

Before getting into the three major trends that I have seen, I want to say that I am really encouraged that the “method wars” seem to be over. Most organisations do not care so much about the specific method they use or how pure their DevOps or Agile is to some definition and rather focus on results and progress. This is very much aligned with my personal position and made it a real pleasure working with my clients last year. There was a lot less dogma than only a few years earlier. I hope this continues in 2020.

Here are my three major areas that I have worked with my clients on:

DevOps team – I spent a lot of time this year creating solutions with DevOps teams that are as self-contained as feasible. It is surprising that we don't yet have a common understanding of how to work with a full-stack, cross-functional team when you have to consider:

  • Solution delivery/development that is somewhat plan-able and often based on Scrum
  • Work with the ad-hoc nature of operations tasks based on Kanban
  • The work that is more platform related like infrastructure, CI/CD, middleware, integration
  • Transformational work or work to reduce technical debt

Getting the balance right between all these things within a team, or splitting them out to multiple teams, has been a fascinating discussion with my clients this year. I am very much looking forward to seeing this at large scale in 2020. Last year this idea really became mainstream and more and more clients asked for this type of delivery – finally ;-). Here is a simple systematic picture of the associated flow of work:

How to combine Dev and Ops type workflows

DevSecOps – For me security was always included in DevOps, but boy did I underestimate the magnitude of this. In 2019 I spoke to nearly all my clients about security and the new challenges in this space. I spoke a lot with security experts in the industry and learned so much more about what is happening in this space. I already had an appreciation for the need to secure your applications and your infrastructure (including the DevOps tools), but learning about the magnitude with which DevOps maturity increases the consumption of open source components, and the speed of that consumption, blew my mind. Also the new threat vectors of people placing malicious code on purpose in open source components were something I hadn't considered before. I for one will make sure all my solutions treat security as a primary concern.

Digital Decoupling – Last but not least, the idea of digital decoupling. With the new IT landscapes many organisations are faced with the challenge of becoming less reliant on mainframes and finding ways to reconfigure their packaged software ecosystems in better ways. Data has become the new answer to how to decouple systems. Being able to work on the data layer instead of having to rely on APIs, ESBs and the like has really opened completely new ways to address this problem. The speed and agility with which you can create new functionality in this new architecture pattern is impressive. And by investing in the new stack and growing new architectures in an efficient way we can slowly get rid of the legacy applications over time. All that while creating new functionality. Gone are the days of "like for like" upgrades or technology transformations which take months and years. And of course the new architectures are being built based on Agile and DevOps ways of working, enabling our transformation.

All three of these trends are not completely new in 2019 but are now well and truly center stage. I will continue to progress them with my clients in 2020 and am looking forward to sharing what I learn with you. A super exciting year lies ahead to fully reap the benefits of these three trends coming together.

A personal note from my side on 2019, as it was an amazing year professionally for me:
The year 2019 started with my book “DevOps for the Modern Enterprise” winning the best DevOps book of the year award and ended with my nomination for “Best DevOps practitioner”. What an amazing year. I look forward to 2020 and how I can help my clients best this year. I have a feeling that this year I will spend more time with a smaller number of organisations and get into a lot more day-to-day details. I am looking forward to that! Nothing is more motivating to me than to achieve results and see an organisation make progress towards better delivery of software-based solutions. And yes getting my hands dirty with the messiness of “heritage” a.k.a. legacy technologies 😉

I hope to meet many of you at a conference or at work to discuss our experiences and share what we have learned in our journeys so far.

Have an amazing 2020 everyone!

The modern IT architect needs to manage 3 architectures to enable Agile and DevOps ways of working

I have been a technology architect for a long time and have worked with many different technologies. And there is something satisfying about coming up with "the architecture solution" for a business problem. The ideal end-state that once implemented will be perfect.

Unfortunately I had to come to the realization that this is not true. There is no end-state architecture anymore and there never was. All those diagrams I drew with the name end-state in them – they are all obsolete by now.

Knowing that architecture will continue to evolve (just look at the evolution of the architecture of Amazon or many other internet companies over the years) means that as architects we need to think differently about architecture. We need to build architectures that, before they are even implemented, already consider how parts will be replaced in the future. No stone will remain unturned over time, no matter how good things seem at the moment. So rather than spending time on defining the end-state, we need to spend more time understanding and defining the right principles of architecture for our organisations and managing the evolution of the architecture – how technical debt is paid down and how systems are being decoupled for their eventual replacement.

This would be difficult if we had to deal just with the architecture of business systems. Reality is that in the current IT world we have to deal with three different architectures: the business systems architecture, the IT tools architecture and the QA and Data architecture.

Let’s quickly define these:

  • Business Systems Architecture – this is usually well defined in organisations. How do your CRM, ERP and billing systems work together to achieve business outcomes
  • IT Tools architecture – this is the architecture of all your tools that make IT delivery possible: configuration management, container management, deployment, defect management, Agile lifecycle, etc
  • QA and Data architecture – how do we validate that systems are working correctly both in production and in new development and how is data flowing across systems and environments

All three of these architectures need to be managed with the same rigor and focus on continuous evolution as the business systems architecture. This will make the job of architects a little bit more complicated. At the moment I see many organisations not having architects focused on all three architectures as they are not perceived as being of similar importance.

Let me give you some examples from my past to highlight why that is foolish:

  • One of my clients was already pretty mature in their automation so that all deployments to production were fully automated. Unfortunately their deployment architecture was basically a single Jenkins server that was manually maintained. When this server was wiped out by mistake it took weeks to get the capability back to deploy to production – in the mean time very risky manual deployments had to be performed by people who had not done this in months
  • Another client of mine had built a test automation framework that was too tightly coupled so that it took a lot of effort to replace one of the components and maintenance had become so expensive that they had stopped using it – ultimately there was too much technical debt in the tests and the QA and data architecture

The answer of course is that all three architectures need to be managed by architects in similar ways (e.g. failover and availability need to be considered for IT tools and QA tools too) and that the principles of decoupling and continuous evolution need to be aspects of all three.

The architect function is one that will see a lot of change as we come to terms with managing three interconnected architectures and the evolving nature of architecture. But I think it will make the job more interesting and will allow architects to climb down from the proverbial ivory tower to engage with the Agile delivery teams in a more meaningful way.

DevOps Enterprise Summit Las Vegas 2018 – The best yet?

It's already over again – the annual get-together of the brightest DevOps minds (well, the brightest who could make it to Vegas). And in this instance I want to make sure that what happens in Vegas does not stay in Vegas by sharing my highlights with all of you. It was a great event with a slightly higher focus on operations than last time.

The four trends that I picked up on:

  • Self-service operations are considered a good answer to the "DevOps Team" problem
  • The prevalence of Dunning Kruger (Link) when it comes to self-assessments -> We are “DevOps”, we use the “cloud”,…
  • Minimum Viable Compliance as a new term
  • DevOps for AI – I did not see much AI for DevOps yet, perhaps next time

This year the conference focused much more on operations, which is great; for next year I hope that we bring in some of the end-to-end business stories. How have we used DevOps practices to drive business – things like instrumentation of software features to understand the business impact.

The top 3 talks

Andrew Clay Shafer

Andrew spoke in his typical eccentric style (with a tie on his head) about Digital Transformations and doing DevOps. He made it clear that there is no real end to this transformation and compared it with getting fit (something he has taken on successfully since the beginning of the year). All the external help you can get will not make you fit; it's the work you put in yourself. The same is true for this DevOps/Digital transformation. He also made a good point that some messages can be dangerous if they are given before the recipients are ready.

J Paul Reed

I admit that I went into "5 dirty words about CI" expecting a talk about Continuous Integration, like many others, something he humorously took on in the beginning. The talk focused on Continuous Improvement instead and stood out for me. Key learnings:

  • Root causes are a social construct for the point where we stop looking further; more appropriately, we should call those "proximate causes"
  • A great story from Amazon, who used an outage caused by “human error” to look for the weaknesses in the systems rather than finding fault with the person who made a mistake
  • That incidents are not deterministic – there are many parallel universes where the incident might not have happened in the same circumstances; our systems are too complex to be deterministic. The Swiss Cheese model to analyse incidents was a great takeaway for me.
  • Human error is not the cause, it’s the effect. It’s the start of the investigation.

Damon Edwards

Okay, okay, he usually is in my top list of talks because I like his style and approach. This time he told a great case study of how all the new fancy technologies and techniques did not prevent the ops problems. Which was a) funny and b) educational to move us away from all the techno-optimism. He then described the self-service operations model, which I prefer myself and which is also called out in the latest Puppet State of DevOps report, so this was good to see discussed.

 

Some other nuggets from other talks

Courtney Kissler

  • First mention of “Minimum Viable Compliance”
  • I loved the phrases “Geriatric stack” and “PTSD caused by legacy apps”
  • Great story on how compliance can calculate negative ROI to justify investment

Scott Prugh

  • Great case study where he showed the results over the years
  • Showing how Agile and DevOps work together to achieve huge results
  • Aligning teams to value streams not projects, and even treating the platform as a product with a product owner
  • Some interesting aspects on lean portfolio management that is run through several "tanks", like a shark tank
  • I loved the metric – "% of things released outside of the release cycle"

Jeffrey Snover

  • Super inspiring to hear from someone who stood up for what he believed in and that careers can be a bit like snakes and ladders (it took him 5 years to come back from a demotion)
  • Insightful to hear about organisations and/or leaders having a list of people who are moving the company forward and of course those are the people that get well rewarded
  • How transitions/transformations can push your career – introducing the terms stair jobs vs elevator jobs
  • Loved the point that when you are really good at something that doesn't matter anymore, that will be bad for you (either as an organisation or in your career) – great example from DEC, which was excellent at something that didn't matter anymore and missed the move from vertically integrated to horizontal systems
  • He made a point that he believes the pendulum is swinging back to vertically integrated (like Azure and AWS)
  • “Build what differentiates you, buy what doesn’t”

Thorsten Volk

  • Reiterated the point I have heard a few times now that AI is still similar to the 90s just more accessible and powerful
  • Understand what AI and human understanding are good for and when to use which
  • Start with a narrow problem and extend it once you have a useful answer
  • Treat AI as code – Parameters, training set, data transformation pipeline, etc
  • Use public data – there is heaps that you can use to teach your algorithms
  • AI is a virtual reality as you can only see what is in the data and that data can be biased

Mik Kersten

  • He introduced his new framework to help with the transition from project to product which is described in his new book, which was available at the conference for the first time (I will review it in a blog post in due course)

Rob England

  • How some of the new material in the DevOps world has forgotten the old and sometimes is even reinventing it
  • Interesting anecdote that the SRE book comes to the conclusion that the key metric is unplanned downtime – something the ITSM community has known all along
  • How DevOps has not covered everything that ITSM covered, like user training/desktop management, … – there is some benefit in reviewing the more rigorous material from ITIL
  • Gave us hope that ITIL 4 will be more relevant and easier to consume vs ITIL 3 being a bit of a mixed bag

Jez Humble, Nicole Forsgren

  • Only 22% of people who say that they use the cloud are following the 5 NIST characteristics of cloud computing – most hands went down at "On-Demand Self-Service"

Cornelia Davis

  • Difference between functional and imperative programming
  • Why functional programming allows us to get systems to do more for us and is less error-prone because there is no state embedded
  • The term “Repave environments” – refreshing every part of the environment which we should do regularly
  • Introduced the concept of “Sidecars” – a container next to another container in a Kubernetes pod that deals with cross cutting concerns like security

 

Another brilliant conference is over and I am already looking forward to next year.