I have been a technology architect for a long time and have worked with many different technologies. And there is something satisfying about coming up with “the architecture solution” for a business problem. The ideal end-state that once implemented will be perfect.
Unfortunately I had to come to the realization that this is not true. There is no end-state architecture anymore and there never was. All those diagrams I drew with the name end-state in it – they are all obsolete by now.
Knowing that architecture will continue to evolve (just look at how the architecture of Amazon and many other internet companies has changed over the years) means that as architects we need to think differently about architecture. We need to build architectures that, even before they are implemented, already consider how parts will be replaced in the future. No stone will remain unturned over time, no matter how good it seems at the moment. So rather than spending time on defining the end-state, we need to spend more time understanding and defining the right architecture principles for our organisations and managing the evolution of the architecture – how technical debt is paid down and how systems are decoupled for their eventual replacement.
This would be difficult enough if we only had to deal with the architecture of business systems. The reality is that in the current IT world we have to deal with three different architectures: the business systems architecture, the IT tools architecture and the QA and data architecture.
Let’s quickly define these:
- Business Systems Architecture – this is usually well defined in organisations. How do your CRM, ERP and billing systems work together to achieve business outcomes
- IT Tools architecture – this is the architecture of all your tools that make IT delivery possible: configuration management, container management, deployment, defect management, Agile lifecycle, etc
- QA and Data architecture – how do we validate that systems are working correctly both in production and in new development and how is data flowing across systems and environments
All three of these architectures need to be managed with the same rigor and focus on continuous evolution as the business systems architecture. This will make the job of architects a little bit more complicated. At the moment I see many organisations not having architects focused on all three architectures as they are not perceived as being of similar importance.
Let me give you some examples from my past to highlight why that is foolish:
- One of my clients was already pretty mature in their automation, so all deployments to production were fully automated. Unfortunately their deployment architecture was basically a single Jenkins server that was manually maintained. When this server was wiped out by mistake, it took weeks to get back the capability to deploy to production – in the meantime very risky manual deployments had to be performed by people who had not done this in months
- Another client of mine had built a test automation framework that was too tightly coupled so that it took a lot of effort to replace one of the components and maintenance had become so expensive that they had stopped using it – ultimately there was too much technical debt in the tests and the QA and data architecture
The answer of course is that all three architectures need to be managed by architects in similar ways (e.g. failover and availability need to be considered for IT tools and QA tools too) and that the principles of decoupling and continuous evolution need to be aspects of all three.
The architect function is one that will see a lot of change as we come to terms with managing three interconnected architectures and the evolving nature of architecture. But I think it will make the job more interesting and will allow architects to climb down from the proverbial ivory tower to engage with the Agile delivery teams in a more meaningful way.
It’s already over again – the annual get-together of the brightest DevOps minds (well, the brightest who could make it to Vegas). And in this instance I want to make sure that what happens in Vegas does not stay in Vegas by sharing my highlights with all of you. It was a great event with a slightly higher focus on operations than last time.
The four trends that I picked up on:
- Self-service operations are considered a good answer to the “DevOps Team” problem
- The prevalence of the Dunning-Kruger effect when it comes to self-assessments: we are “DevOps”, we use the “cloud”, …
- Minimum Viable Compliance as a new term
- DevOps for AI – I did not see much AI for DevOps yet, perhaps next time
This year the conference focused much more on operations, which is great. For next year I hope that we bring in some of the end-to-end business stories: how have we used DevOps practices to drive business, with things like instrumentation of software features to understand the business impact?
The top 3 talks
Andrew Clay Shafer
Andrew spoke in his typical eccentric style (with a tie on his head) about Digital Transformations and doing DevOps. He made it clear that there is no real end to this transformation and compared it with getting fit (something he has taken on successfully since the beginning of the year). All the external help you can get will not make you fit; it’s the work you put in yourself. The same is true for this DevOps/Digital transformation. He also made a good point that some messages can be dangerous if they are given before the recipients are ready.
J Paul Reed
I admit that I went into “5 dirty words about CI” expecting a talk about Continuous Integration like many others, something John humorously played on at the beginning. The talk focused on Continuous Improvement instead and stood out for me. Key learnings:
- Root causes are a social construct for the point where we stop looking further, more appropriately we should call those “proximate causes”
- A great story from Amazon, who used an outage caused by “human error” to look for the weaknesses in the systems rather than finding fault with the person who made a mistake
- That incidents are not deterministic – there are many parallel universes where the incident might not have happened in the same circumstances, our systems are too complex to be deterministic. The Swiss Cheese model to analyse incidents was a great take away for me.
- Human error is not the cause, it’s the effect. It’s the start of the investigation.
Okay, okay, he is usually in my top list of talks because I like his style and approach. This time he presented a great case study of how all the fancy new technologies and techniques did not prevent the ops problems, which was a) funny and b) educational in moving us away from all the techno-optimism. He then described the self-service operations model, which I prefer myself as well, so it was good to see it discussed. It is also called out in the latest Puppet State of DevOps report.
Some other nuggets from other talks
- First mention of “Minimum Viable Compliance”
- I loved the phrases “Geriatric stack” and “PTSD caused by legacy apps”
- Great story on how compliance can calculate negative ROI to justify investment
- Great case study where he showed the results over the years
- Showing how Agile and DevOps work together to achieve huge results
- Aligning teams to Value streams not projects and even using the platform as a product with product owner
- Some interesting aspects on lean portfolio management that is run through several “tanks” like sharktank
- I loved the metric – “% of things released outside of release cycle”
- Super inspiring to hear from someone who stood up for what he believed in and that careers can be a bit like snakes and ladders (it took him 5 years to come back from a demotion)
- Insightful to hear about organisations and/or leaders having a list of people who are moving the company forward and of course those are the people that get well rewarded
- How transitions and transformations can push your career – introducing the terms “stair jobs” vs “elevator jobs”
- Loved the point that being really good at something that doesn’t matter anymore is bad for you (either as an organisation or in your career) – great example from DEC, which was excellent at something that didn’t matter anymore and missed the move from vertically integrated to horizontal systems
- He made a point that he believes the pendulum is swinging back to vertically integrated systems (like Azure and AWS)
- “Build what differentiates you, buy what doesn’t”
- Reiterated the point I have heard a few times now that AI is still similar to the 90s just more accessible and powerful
- Understand what AI and human understanding is good for and when to use what
- Start with a narrow problem and extend it once you have a useful answer
- Treat AI as code – Parameters, training set, data transformation pipeline, etc
- Use public data – there is heaps that you can use to teach your algorithms
- AI is a virtual reality as you can only see what is in the data and that data can be biased
- He introduced his new framework to help with the transition from project to product which is described in his new book, which was available at the conference for the first time (I will review it in a blog post in due course)
- How some of the new material in the DevOps world has forgotten the old and sometimes is even reinventing it
- Interesting anecdote that the SRE book comes to the conclusion that the key metric is unplanned downtime – something the ITSM community has already known
- How DevOps has not covered everything that ITSM covered, like user training and desktop management – there is some benefit in reviewing the more rigorous material from ITIL
- Gave us hope that ITIL 4 will be more relevant and easier to consume vs ITIL 3 being a bit of mixed bag
Jez Humble, Nicole Forsgren
- Only 22% of people who say that they use the cloud are following the 5 characteristics of NIST – most hands went down at “OnDemand Self Service”
- Difference between functional and imperative programming
- Why functional programming allows us to get systems to do more for us and is less error-prone because there is no state embedded
- The term “Repave environments” – refreshing every part of the environment which we should do regularly
- Introduced the concept of “Sidecars” – a container next to another container in a Kubernetes pod that deals with cross cutting concerns like security
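The difference between the imperative and functional styles mentioned above can be illustrated with a small sketch. This is my own illustration, not an example from the talk; the names `RunningTotal` and `add_all` are hypothetical.

```python
class RunningTotal:
    """Imperative style: the result of add() depends on hidden, mutable state."""

    def __init__(self) -> None:
        self.total = 0

    def add(self, amount: int) -> int:
        # Mutates the object; calling this twice with the same
        # argument returns two different results.
        self.total += amount
        return self.total


def add_all(amounts: tuple[int, ...]) -> int:
    """Functional style: the output depends only on the input, nothing is mutated."""
    return sum(amounts)


tracker = RunningTotal()
first_call = tracker.add(5)
second_call = tracker.add(5)   # same argument, different result

pure_first = add_all((5, 5))
pure_second = add_all((5, 5))  # same argument, same result every time
```

Because `add_all` carries no embedded state, it can be called again, retried or run in parallel without surprises, which is the property the talk credited for making functional code less error-prone.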
Another brilliant conference is over and I am already looking forward to next year.
As Agilists we keep using the Agile manifesto and are pretty much beyond questioning the exact words on the page. One key one is:
Working Software over comprehensive documentation
And over the years we had the arguments that this does not mean that there is no documentation, what exactly the definition of working software is and many other aspects of this. Yet I failed to pick up on a very important nuance for all those years.
The other day I was running an Agile training and someone with less of an IT background called out to me that she disagrees with the statement “working software over comprehensive documentation”. I was shocked initially until she explained that working software means terribly little when no one uses it (for example due to poor documentation or change communication) or if the software does not solve a business problem.
Wow – how had I not noticed this clear miss before? Of course, we are not about just building software, and the Agile manifesto came from people who were thinking about creating software in a better way. But more than a decade after I started to run Agile trainings, why had I never noticed this clear gap in the manifesto for our modern context?
A few weeks earlier I was at an Agile conference in Germany and someone spoke about Agility rather than Agile and pointed out how much of the language in the Agile principles behind the manifesto is software specific (e.g. “The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.”). How have we not updated this material to be more relevant by changing the wording to be inclusive of business and other parts of the organisation that need to enable success? See below for references to software development in more than half of the principles.
Of course, the manifesto has its relevance as a historic document, but whenever we talk about it, it is worthwhile calling out that we now have a broader understanding which goes way beyond IT and software development. I for one will call this out more clearly whenever I run training going forward. We are using Agile to solve business problems, and creating better software is only a small part of the solution. Thank you Jane for calling out a blind spot that I have had for years!
UPDATE: Thanks to Phil I found this tweet which discusses a solution:
I always loved this quote: “Nothing is more dangerous than using yesterday’s logic for today’s problems” which shows you that you just cannot afford to get lazy and do the same thing again and again. This causes larger problems when you scale it up. Gary Hamel summarizes the problem our organisations face as follows: “Right now, your company has 21st century Internet enabled business processes, Mid 20th century management processes all built atop 19th century management principles.”
One of the main reasons for me to write “DevOps for the Modern Enterprise” was to help address this mismatch between the work we need to do, creative IT based problem solving, and the management mindset many managers still have, that of IT being managed just like manufacturing.
I like to use the term mental model to describe what having the wrong mindset means for the everyday job of managers and other executives. Let’s take a very practical example to show you how your mental model shapes your view of reality. Look at the vase in the picture below. What do you see?
Depending on how your brain has been shaped up to this day, you will see different things on the vase. Children predominantly see 9 dolphins (see further below to help you see them). I guess that you saw something different, didn’t you? What does that say about your mental model of reality and your preferences 😉 What this exercise hopefully shows you is that each person’s view of reality is not exactly the same and that the mental model you use makes an important difference in how you perceive reality and act.
Perceiving IT as being similar to manufacturing leads to management processes that are inappropriate: you look for productivity measures where there are none (more about that here), you expect people to be replaceable resources, you think that fixing the process will fix the end-product and that you can plan projects upfront. Pretty much all of those have been shown to be incorrect.
As a starting exercise for changing your mental model, I recommend watching Dan Pink’s video on motivation (Watch it here). I leverage his idea extensively in my book and think it is a perfect match for Agile delivery, where we provide purpose by giving the Agile team the context of the problem they are solving, we allow them to achieve mastery through quick feedback cycles and we create cross-functional teams that are reasonably autonomous. Once you understand Dan Pink’s mental model you can easily diagnose some of the common problems with Agile projects that don’t provide those three motivators.
This shift in mental model is exciting stuff and goes much further in areas of operations and working with vendors/partners, you can read more about it in my book. For now I hope I was able to motivate you to look further into the topic and for you to try to be more conscious of your own mental model. It is worth challenging the model you have and perhaps you are then able to see those dolphins too 😉
A few weeks ago I was in a group of people and someone made the statement that application management (AM) is a commodity, and most people in the group agreed. I didn’t say anything, but I think application management is one of the most exciting places to be today. There are so many technology trends that apply to AM: Agile, DevOps, artificial intelligence, etc.
When I say application management I mean some level of operations (e.g. level 3 support and code fixes) and some level of application development/maintenance (smaller changes that don’t require projects). The nature of those activities is that they are frequent and small, which means we have a lot of opportunity to learn and improve. It’s the ideal testbed to showcase how successful modern engineering practices can be, assuming the volume of work is large enough to justify investment.
Let’s look at two dimensions of this: the code changes and the platform services
The product team making the actual code changes should have responsibilities for making changes for both new scope and production defects. This team needs to reserve some capacity for urgent production defects, but the bulk of production defects get prioritised against new scope and are just forming part of normal delivery. This is possible because the release frequency is high (at least monthly) and the team can leverage platform services like CI/CD, feature flags, automated environment provisioning and monitoring services. The team collaborates with the platform services team in many ways and focuses on the application they own. Both Kanban and Scrum are possible methodologies this team can use, making this a great Agile working space.
The platform team provides the services which make life easier for the product teams. They also provide the first levels of support in a centralised fashion to protect the product team from too many distractions. There are two goals for this team: to provide platform services to the product team (like CI/CD, environment provisioning, monitoring) and to keep improving the customer experience for end-users. Fast ticket resolution is not the right measure for the second goal; the right measure is how far the overall number of tickets is reduced. Advanced monitoring of performance, functionality and infrastructure is used to find problems before the customer finds them. Each user-created ticket is an opportunity to improve the monitoring for next time. Chatbots and self-service are used to provide the fastest service for standard requests, and AI/ML learns from tickets to make more and more requests standard and to route tickets to the best team or person. Analytics on all aspects of the platform allows us to find patterns and reconcile them against issues. The platform team collaborates with the product team to make sure the application-related services keep being improved. Over time this team becomes a small team of automation experts rather than a large team of operators for manual tickets.
Seriously, if you look at the above, how are you not super excited about this space? If you do this right you get to use a lot of super modern engineering practices to improve the delivery capability of your organisation. The money and capacity freed up once you have implemented the first improvements can then be used to make more and more meaningful changes to the products and services. You can do this with a focus on the creative tasks rather than the fire fighting and repetitive manual work that consumed application management teams in the past.
I am working with a couple of organisations now that are as excited as I am about this space and have a similar vision. I am very much looking forward to seeing what results we will achieve together.
A few weeks ago I published a blog post on simple DevOps questions you can ask your team to find out how mature your adoption of DevOps practices is (you find it here). There is a case to be made that you should have a similar question for your Agile adoption. The problem here is that there are so many different flavors of Agile and that not one is better than the others (no matter what all those zealots out there are saying). The large number of variations makes it hard to analyse the maturity of Agile unless you understand the context of the organisation and can evaluate the delivery success of the team. But I found a question that helps me differentiate between the “false” or “hybrid” Agile flavors and “real” Agile. I call it my minimum definition of Agile for it to be something I can engage with and think that it can be successful. Here we go, Mirco’s minimum Agile definition:
“One Team, produces a working and tested version of software within each strict timebox of equal length.”
You might think this is a really low bar for an Agile definition, but it rules out a lot of the work that I see being done under the name “Agile”. It rules out all of the following patterns that are suboptimal if not dangerously risky for delivery success:
- Separate design, build and/or test teams – often this is a remnant of existing vendor relationships or strong “fiefdoms” and the Agile change has not been able to break down the barrier. For me this is an absolute showstopper for Agile success. Pause and see how you can enable a joint team.
- Design sprints, followed by build sprints, followed by test sprints – Well let’s call this what it is, Waterfall delivery 😉
- Teams being changed and swapped – Many of the Agile benefits only manifest once the team has been working together for a while, if you need to keep changing people or ramp-up or down then there is something wrong with the work preparation part
- Agile teams building lots of stuff but unable to validate it end-to-end – This is perhaps an already more advanced problem but I keep seeing it a lot these days. Agile teams deliver software but cannot test it sufficiently, and while the velocity is high, once the product gets end-to-end tested it falls apart. There are three root causes that I commonly see: unavailability of test environments, lack of maturity in DevOps practices and lack of up-front planning of end-to-end scenarios.
- Sprints that have different lengths or are changed at the last minute – Agile is more successful because it is more rigorous and forces you to address problems early and often. If you weaken the conditions you are stepping onto a slippery slope that will likely impact you more and more. Step off the slope and fix the problems rather than changing the sprint timebox
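The minimum definition can also be expressed as a simple check over a team’s sprint history. The following is only a sketch of how I would encode it; the `Sprint` record and its field names are hypothetical and not taken from any tool.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Sprint:
    team: str                       # hypothetical: name of the delivering team
    start: date
    end: date
    working_tested_increment: bool  # did the sprint end with working, tested software?


def meets_minimum_definition(sprints: list[Sprint]) -> bool:
    """One team, a working and tested increment per timebox, timeboxes of equal length."""
    if not sprints:
        return False
    one_team = len({s.team for s in sprints}) == 1
    equal_length = len({(s.end - s.start).days for s in sprints}) == 1
    always_working = all(s.working_tested_increment for s in sprints)
    return one_team and equal_length and always_working


history = [
    Sprint("billing", date(2019, 1, 7), date(2019, 1, 21), True),
    Sprint("billing", date(2019, 1, 21), date(2019, 2, 4), True),
    # Stretched by a week at the last minute: breaks the equal-length rule.
    Sprint("billing", date(2019, 2, 4), date(2019, 2, 25), True),
]
print(meets_minimum_definition(history[:2]))  # the two regular sprints
print(meets_minimum_definition(history))      # includes the stretched sprint
```

The check is deliberately strict: a single stretched sprint, a handover to a separate test team or a sprint without a tested increment is enough to fail it.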
I have been happy with this definition for now as it weeds out the most common bad patterns, perhaps it helps you too. Let me know if you have your own question or definition that you use.
Over the last couple of months I have been speaking to project teams and organisations that are undergoing major technology transformations and that have set out on this course with traditional, more or less waterfall approaches. Changing course during such a transformation is risky, and any changes are usually of a smaller nature because the risk appetite is low when so much money is on the line. I understand that: while I personally think Agile is less risky in any case, the organisation’s maturity with Agile and the change energy required probably prevent a change in-flight.
But here is the thing, once you get to the end of the current transformation your whole delivery process is tuned for the big change that you are currently undergoing. If you use the same governance approach and delivery method for the smaller changes that come after the transformation you will be really inefficient. You will wish you had used the transformation to not only set you up with a new technology but also with a delivery mechanism that supports you effectively after the transformation is over when change is smaller and more frequent.
This is where a bit of planning ahead can go a long way. If you realise the above you can use the time while your transformation is still under way to prepare yourself for post-transformation Agile delivery. You can build DevOps practices into your ways of working, because they support waterfall delivery as well as Agile delivery. All the automation and process improvements will make the transformation effort less risky, and the cultural shift can start to gain momentum through changed behaviour. If you have a staged go-live over multiple releases you can start to embed Agile into your production support and maintenance processes so that your organisation starts to learn about Agile methods of working.
In my book “DevOps for the Modern Enterprise” I talk about transaction costs in IT, and this is another case where the concept helps to explain the situation. Say your transaction cost for a release (all the effort for regression testing, deployments, release planning, go-live support etc.) is 100 units for your transformation, which is a large development effort of 10000 units. Using the same processes will still cost you close to 100 units for smaller changes post transformation (let’s say 1000 units), so the overhead grows from roughly 1% to roughly 10% of the work delivered. This makes delivery of small changes really inefficient, and you might start to bundle them up again into larger, less frequent releases. What you should do is take a close, hard look at all the transaction costs and invest during the transformation to reduce them so that you get yourself ready for the time after. Otherwise the post-transformation blues is going to come quickly and you will soon see yourself in the next transformation cycle to improve the delivery process.
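To make the arithmetic concrete, here is a minimal sketch using the unit figures from the paragraph above. The function name and the reduced cost of 20 units are my own assumptions for illustration, not figures from the book.

```python
def overhead_ratio(transaction_cost: float, development_effort: float) -> float:
    """Return the release transaction cost as a fraction of the development effort."""
    if development_effort <= 0:
        raise ValueError("development_effort must be positive")
    return transaction_cost / development_effort


# Figures from the text: the same ~100-unit release process applied to
# a 10000-unit transformation and to a 1000-unit post-transformation change.
transformation = overhead_ratio(100, 10_000)
small_change = overhead_ratio(100, 1_000)

# Assumption for illustration only: automation cuts the transaction cost to 20 units.
small_change_after_investment = overhead_ratio(20, 1_000)

for label, ratio in [
    ("transformation release", transformation),
    ("small change, same process", small_change),
    ("small change, reduced transaction cost", small_change_after_investment),
]:
    print(f"{label}: {ratio:.0%} overhead")
```

The same release process that is barely noticeable on a large transformation becomes a significant tax on every small change, which is why reducing the transaction cost itself matters more than bundling changes.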
Another reason to invest during the transformation is that once there is less work to be done on the functional side there is probably also less money around to make the required investment in changing the way you work and the automation and tooling that is required to support it. It is much easier to justify the bit of extra investment while the transformation is still under way and use the attention of the leadership team and the change energy already in the company during transformation to set yourself up for success post transformation. Don’t let a perfectly fine transformation go to waste for your Agile change effort.