| A | B | C | D | E | F | G | H | |
|---|---|---|---|---|---|---|---|---|
1 | Cloud Native Maturity Model | |||||||
2 | ||||||||
3 | Level 1 | Level 2 | Level 3 | Level 4 | Level 5 | |||
4 | Prologue | Build You have a baseline cloud native implementation in place and are in pre-production. | Operate The cloud native foundation is established and you are moving to production. | Scale Your competency is growing and you are defining processes for scale. | Improve You are improving security, policy and governance across your environment. | Optimize You are revisiting decisions made earlier and monitoring applications and infrastructure for optimization. | ||
8 | People Overview | The team/people are new to the technology with basic technical understanding. Some pre-existing qualifications exist. Cloud native framework is driving business and technical goals. | The team/people are actively invested in training and skills. Small groups of SMEs are appearing. Expertise starting to show. You may be hiring. The foundation has been established. DevOps starting to appear: initial inclusion of cloud-skilled engineers in developer groups, bringing platform skills to developers. | Competency growing. You are getting commitment from dev, ops and security. Formalizing expertise around center of excellence. | Shifting competency to development team. The development team is building containers. | Team is skilled. You have DevOps and DevSecOps working. You've reached maturity so team is comfortable with experimenting with new technologies, sandbox trials. | ||
9 | Developers | You will be upskilling developers | Training application team in 12 factor app, microservices, cloud native patterns | Team able to operate the fundamentals of Kubernetes including: - Connecting an operator to the Kubernetes API - Understanding how to list and view resources - Performing basic actions (mechanical actions with limited understanding of how it works). Reference architecture designed | Repeatable cycle of troubleshooting implemented so changes can be done quickly and iterated on until they work. | Developers are becoming more sophisticated around: - Kubernetes terminology they are expected to affect, like Deployment, DaemonSet, Service or Namespace - Modifying some configuration of Kubernetes resources like ConfigMap or Helm charts - Troubleshooting the CI/CD process - Troubleshooting apps/services within Kubernetes, including accessing logs and events and retrieving metrics/monitoring | Blue/Green deployment - introducing other deployment methods such as blue/green, canary. | |
10 | Developer Agility | You have organizational commitment to decentralization. | Developers may have learned about the Agile Manifesto and adopted the Scrum framework without necessarily including Operations (i.e. Dev but no Ops). May be attempting to resolve external dependencies themselves, slowing down feedback, with incomplete features per sprint. | Team is comfortable with technically challenging problems. Organisational commitment to decentralisation ("team of teams"). Automated testing in use with builds, with automated deployments to some environments, particularly to test environments. | Continuous delivery for all environments, including for complex releases. Developers build compliance testing into CI and CD processes. Ops staff integrated into cross-functional teams, though not as cross-functional individuals. | Feedback is extended from application metrics through to platform and also non-functional requirements. Developers are able to quickly test complex scenarios with many unknowns. Developing maturity of DevOps SMEs. Clear mapping of value streams to technological implementation. Cloud and application risks easily identified and patched quickly. Effective group cohesion. | Effective measurement of velocity; each group is well focused and knows what it is doing. Group has strong ability to recover and maintain throughput, tolerating individuals joining and leaving. Business decisions well informed by rich and accurate data from across all teams in the organisation, allowing for adoption of FinOps. Agile well incorporated into full organisation. | |
11 | Org change | Limited organization support. You will be in a POC phase or be focused on only one application. | Organizational change is happening. You will define project teams, create agile project groups and have quick feedback/testing loops. | As your people’s competency grows, the organization structure is now in place to support best practices. You will have formalized responsibilities. A common pattern used for this structure often embraces agile and scrum. | Cloud is now the default infrastructure for all services. The business is now requesting services from DevSecOps vs. requesting traditional servers. | Entire organization is committed to cloud native environment. Org uses and has a huge footprint. Entire org onboarded | ||
12 | Security | People are using default security settings; working in pre-production. Identifying open source security posture. Security POC of environment. Security understands what is required in cloud native workloads. | Security - identify who is responsible for Kubernetes cluster security and how it will be managed. Security team starting to get involved, training. | Training people on cloud native security requirements; actively skilling people | Involve security in design & deployment Enforcing security: organizational commitment to security with full understanding of policies and regulations both inside/outside of org | Actively developing security both internally, with community and with regulators | ||
13 | Certifications | Linux Foundation Certified System Administrator | Certified Kubernetes Administrator, Certified Kubernetes Application Developer | Certified Kubernetes Security Specialist | ||||
14 | Process Overview | Lack of process. Lack of consistency across implementations. Your process will affect your footprint. This will affect your cluster footprint topology and sizing. | Build out CI/CD maturity | Documented set of processes and capability. Documentation should be close to code and potentially machine generated | Consistent and mature process with a template-based approach | You are actively revisiting your standards. Identifying config drift and adjusting standards based on business requirements | ||
15 | Map your application requirements functional and non-functional (note: future include examples) Define scale: How do you as the organization want to scale | Production promotion of basic application | Standardization across organization - will improve onboarding | Governance model in place to support DevOps. You have guardrails in place to support agile software development. | Building design capabilities for cloud native | |||
16 | Define Git workflow | Well established with Git and CI | Investing in repeatability - do you have tools in place that are accessible to everyone - do you have Git service, confluence to save time, labor, avoid duplication | Application services library | ||||
17 | Infrastructure - Institute structured build and deployment processes that exhibit the qualities of a cloud and container native CI/CD system | Measure resource usage: container usage, CPU, memory (runtime/uptime) | Setting policy around container usage: auto-scaling policies, HPC for example | Using usage data to optimize spend. Feeding back to business cost | ||||
18 | Audit and Logs | Manual, ad hoc log scraping. No central logging point/SIEM. | Define log aggregation | Start to audit and start initial alerts; filtering noise | Auditing and alerts become mainstream. Make them mandatory across applications | Enforcing audits | ||
19 | CI/CD | If you do CI/CD, how will you take existing best practices and build upon them | Application - Institute structured build and deployment processes that exhibit the qualities of a cloud and container native CI/CD system | You are implementing a center of excellence around process | Measure release velocity and cadence | Demonstrate benefit to the organization. Clearly see increase in velocity, continuous deployment speed, see knock-on effect to business. | ||
20 | Change Control | No change control process - changes performed based on ad hoc requests | Develop a fundamental understanding of the workflow from source control management (SCM) to deployment and have access to merge/tag commits in SCM to trigger deployments. | You are measuring code quality with automated tooling. Code quality is going up. You are seeing CI and tests succeed frequently. | You have continuous delivery, but no continuous deployment to production - you still have a gate to production that requires operator approval. | Quality engineering (QE) capability: quality gates in place. You have continuous deployment to production with only a failed test preventing an update from being released to production. You are seeing fewer defects, hotfixes and bug fixes being released. Best practices in place. Remove human access from production in favor of service accounts: use monitoring failures to restart or manage problematic and failing resources | ||
21 | Platform and technology lifecycle and updates, particularly security updates, need to be applied on a regular basis as vulnerable systems pose specific risks. You will likely be applying these updates by hand on an ad hoc basis, or using update systems included in distributions. | Define IaC as pipeline. Automation and processes associated with software release will also be extended to platforms. Lifecycle operations such as upgrades and patching, particularly CVEs and critical updates, will benefit from further automation and the introduction of Infrastructure-as-Code technologies. | Applying practices from application code to your IaC: writing Terraform configurations or Ansible playbooks with some CI/testing for them. Apply best practices from app to infrastructure. Growing maturity of infrastructure | Review your Infrastructure as Code (IaC) to ensure it is solid: use monitoring failures to restart or manage problematic and failing resources | ||||
22 | Feedback | Manual feedback: phone/slack | Creating a feedback loop | Automating responses: Use monitoring failures to restart or manage problematic and failing resources | ||||
23 | Security | Take action: your security journey starts here. Consider security in all aspects of implementation. Make it a first class citizen | Building security into CI process - container scanning | Automatically audit and flag misconfigurations or security issues | Security remediation - automated and/or identified automatically with remediation advice | The software supply chain is secured, with reproducible builds and software bills of materials providing insight into code and dependencies, with clear code provenance and secured release pipelines. Potential visuals of airport as part of the map. Airport standard security - ubiquitous, metal detectors, laptop out, coats off, - standard process and technologies. | ||
24 | Fat repositories with narrow targets | effective standardisation | Thin repos widely applicable | |||||
25 | Software Supply Chain | Gold standard supply chain visual of high security military | ||||||
26 | Policy Overview | Your goal is to implement policy to embrace shift left security. You will seek to implement policies and automate enforcement in CI/CD and continuously monitor against policies in production. | Limited set of documented policies in place to support services | Initial policies agreed on as standard. Policies are mostly written | Implement policy as code Coding for Policy in build, CI | Defined SLAs around policies and remediation | Based on your learnings, you will refine your policies as your organization achieves maturity, taking advantage of machine learning. | |
27 | Note for policy - it's a gradient every org has different risk, policy requirements and risk appetite. When we create this, it's gradients | Understanding compliance requirements: CIS, NIST, PCI Designing SLOs and priorities for compliance | Gaining visibility into compliance | Engaging with regulators | ||||
28 | By level 5 you will be at this level of maturity. Your mileage may vary. Show journey here. Parable | Auditing may be done manually, or through simple scripts. | Auditing capability is automated | |||||
29 | Understand your application requirements | Gathering resource metrics | Create policies based on metrics around efficiency and reliability | Customising policy based on business needs (minimise exceptions) | Contributing policies to the open source community. | |||
30 | Technology Overview | You will already have VMs on demand, some automation, baseline security components such as SIEM, firewalls, RBAC concepts, a directory of some type... and a reasonable understanding of why you want to move to becoming a Cloud Native org. | Initial adoption of Kubernetes; initial experimentation. You are starting with basic tools and technology. Assessment of existing toolset and how it fits into the new landscape. What plays with cloud native/what doesn't. You have limited automation. | Incorporate monitoring and observability into workloads. Bringing in Prometheus, monitoring clusters for standard metrics (CPU, RAM, storage IO, etc.), starting to evaluate the need for application tracing. Don't worry about tracing if you haven't started gathering core metrics. | In level three you have a level of buy-in across the organisation and things will scale dramatically. Standardized set of tools for certain functions | You have full control over the environment. There are no unknowns. You have built your confidence. You have enough of an organizational commitment. Rapid adoption. You've crossed the chasm. | Your investment has moved on to automation in all of your functional and non-functional areas: scanning, security, testing, operators doing ops for you. You are fully automated. | |
31 | Application Patterns and Refactoring Start with a canonical microservice application if you can: one that runs and that people are familiar with. You can try an existing or monolithic application if this makes sense, as this will flush out tooling and dependencies you'll have for your journey to cloud native: kubectl, network connectivity etc. You're beyond minikube. | By choosing the serverless route, here is a working model for the path. You may adapt this to your model. Business needs to be sitting down and reviewing microservice patterns and architecture. Looking to understand the specifics for applications. Non-functional requirements such as latency, resilience, scaling and third party tooling. If you're transforming a monolith this may impose significant redesign on the application as existing needs may not have the technical resources available. Try to ensure that the knowledge stays with the code - make sure an existing developer familiar with the code participates in its migration to cloud. Minimise divergence between cloud and your existing estate. It's a commitment to move to cloud native. | You're in production, with your first APIs exposed. Develop a microservices first framework. - First choice should always be microservices approach. If you cannot, lift/shift application or don't migrate the app until later. | The business has started to think about services rather than "servers". Microservices are embraced within the organisation and are used by default where appropriate. | Microservices have become the preferred pattern for applications. The use of APIs is expanding within the organisation, and other internal systems may be exposed and consumed, and they are available for general consumption, open across the organisation via a service mesh. The organisation becomes data-centric and API-centric, and data can be more easily consumed. 
| All new greenfield applications are cloud native. Existing portfolio of applications will onboard to cloud native using the proven process. Your application matches your platform. (round peg into square hole... imagery) | ||
32 | Infrastructure | Build cloud infrastructure. Plumb in external cloud dependencies. Think about your network, your firewalls, your IAM. Who and what will access your cloud infrastructure? How do your existing policies affect this and will you need to change them? | Build Kubernetes Infrastructure; adoption of declarative solutions for IaaS | Refine operations; develop monitoring, alerting and resource usage capabilities | Looking at Cluster API for deploying clusters | Build, upgrade and back up systems and infrastructure via software and tooling - further maturing the software-defined data center | ||
33 | Container and Runtime Management | Application containerization; adoption of container registries | Scan containers at build time and rest. Start gaining metrics. Runtime scanning | Gain continuous visibility into your Kubernetes security posture by automatically scanning containers for vulnerabilities and auditing clusters for weaknesses. | Visibility, have some alerts. Need a way to aggregate this. | Automate security tasks. You have all your security information in one centralized repository, You are automating where possible, remediating. | ||
34 | Application Release and Operations | Ad-hoc deployments using tooling like kubectl and kustomize to deploy to clusters | Push deployments using external tooling like Jenkins/kubectl and kustomize to deploy to clusters for production. Write YAML! Developers and Operations engineers becoming very familiar with YAML, and tooling associated with it. | Initial testing with GitOps operators. Write Helm Charts | Developers using GitOps operators for development and test | Full production usage of GitOps operators with appropriate controls and Git workflow | ||
35 | Testing and Issue Detection | Non-production testing (UAT, smoke test) conducted manually on identified business application within an organisation's environment. Basic Kubernetes functionality such as network connectivity, deployment of application(s) | Experiment with tooling to help with security, policy management, workload misconfigurations, resource requests and limits. | Implement tooling to help with security, policy management, workload misconfigurations, resource requests and limits. Proper alerts, good dashboards. Full production with this tooling. | Having built a corpus of issues, refine them and build out automation to resolve them. | Automate issue detection during application development to prevent mistakes from entering production in the first place. | ||
36 | Security and Policy | Build your secured CI/CD pipeline; release tooling may be used for deployment. Identify gaps in existing CI/CD pipeline. What you are doing today with VMs will end up quite different. | Developers and Ops groups following good practices with secrets (not checking them in to Git!) and other vectors for abuse | Automate deployment guardrails and security best practices through Open Policy Agent (OPA) or Kyverno integration with CI/CD. Policy as Code. Effective management of secrets that follows your organization's requirements. You tie development groups to secrets and can organize and secure them. | Conducting dry runs of policy against production; tuning core Policy as Code policies in production with realtime remediation | Ongoing optimisation and adjustment in line with ongoing and new requirements, adjusting to the ongoing threat environment. Exceptions are both minimised and formally controlled. | ||
37 | Teams exploring cloud native, Kubernetes. Not just exploring for the sake of it. Goal to reach production. Formal MVP programs underway. | formalisation of centralised services and responsibilities; consolidation of tooling; culling/evaporation of non-cloud native tooling | high degree of centralisation - good skills centralised, with clear and understood responsibilities but a decrease in velocity and chokepoints in process. Things slow down. | commence decentralisation journey - your platform is now established Commence Development of a limited self service portal reflecting the policy and process environment of the org; actively soliciting policy guidance | self-provisioning amongst different groups Org-wide acceptance of self service portal | |||
38 | Business outcomes | Deciding to adopt a cloud native approach to your application or services is usually driven by business reasons. You should have documentation as to what your business goals are and how cloud native helps you achieve those. Some examples may include: - Scale to 1 million users: Provide flexible, scalable infrastructure based on users at any given time, equipped with fast failover in the event of a problem. - Deliver exceptional customer experience: Ensure the app is reliable so as not to frustrate users. - Get features to market faster: Enable a microservices approach to building apps. Smaller teams are more agile because each team has a focused function. APIs minimize the amount of cross-team communication required to build and deploy. | You have completed a successful POC. Based on the POC, you should have initial findings on how cloud native will help improve your app. In a dev environment, you could, for example, have seen that: - An app is using fewer resources (cost savings / more efficient use) - A new feature shipped faster (faster time to market and thus increased revenue) - There was no downtime (improved reliability for customers). These are just examples; they are not a guarantee based on your environment. Results may vary | You should use your documented business goals to track progress, but remember it won't be immediate on day 1. Business outcomes may include: - Reduced spend on app infrastructure - Reduced team focus on app infrastructure (note: this will happen over time in this phase as the team gets more confident in their skills) - Increased security for the application - Improved compliance as you can restrict and track access to the application - Accelerated development life cycles as you implement CI/CD pipelines and thus get features to customers faster. In this phase, it’s important that the business outcomes are examined and explained to business stakeholders. 
It should be a discussion with engineering leadership, the application owner (finance, marketing, etc.), the CEO, and even the board. Without these discussions and alignment, maturing to the next phases will come with little appreciation and possibly even skepticism. | Up to this point, your teams have been focusing on learning cloud native. In this stage, your business outcomes are dependent on your team’s experience. As the team builds confidence, their competency around security, efficiency and reliability grows and they will implement defined processes for scale. All of these will impact your services and applications as the team improves. Your team may have to revisit some decisions made earlier as cloud native technologies were rolled out. That could set you back slightly, but the goal is to ensure there is no missing functionality, no single point of failure and no disappointing performance. Monitoring is implemented. This will help the business get reports on what’s working and what isn't working. While the monitoring may be very specific, it will also provide insights into: - Resource utilization to control costs - Performance to ensure availability | Your team has cloud native confidence. Now it’s time to take that knowledge and apply it more thoroughly to your business goals. Your team will be able to focus on your business instead of maintaining Kubernetes. Alignment on goals. If the goal was security, for example, you will spend time on this. If it was to increase speed of development, you will ship faster. You will have measurement of what you've put in place. This is important as it will be used to demonstrate business outcomes. The business should expect to see: - Established protocols and procedures - Policy enforcement of compliance standards - Comparison of cloud native apps vs. non-cloud native. The business should expect more reporting in this phase. Reporting should cover compliance, security, performance and cost. 
These should be easily aligned to the business goals established in phase one. For some, at this point, you may start to migrate your other applications and have a better understanding of what you want to achieve. | You should have achieved your business outcomes. You should have measurable results to show your leadership teams from the CEO to the CFO and the board. You will continue to optimize your workloads against further/more advanced cost and performance metrics. You will never stop optimizing your cloud native infrastructure and apps. Here the expected business outcome is the ability to track how optimization continues to move the bar against established goals. You may also revisit your goals at this point, adjusting them to what has been achieved and what you want to achieve in future. You’ll automate as much as possible according to cloud native best practices to remove human error and so avoid security and performance problems. | |
39 | Formal commitment to POC (Proof of Concept). Business case identified - either an application or a workload that won't fit legacy. | |||||||
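
Rows 35 and 36 refer to tooling that catches workload misconfigurations such as missing resource requests and limits, and to Policy as Code enforced in CI/CD. The following is a minimal sketch of that idea in plain Python. It is not OPA or Kyverno, and the function name `find_missing_resources` and the sample manifest are hypothetical, introduced only for illustration. It assumes the Deployment manifest has already been parsed into a dictionary.

```python
from typing import Any


def find_missing_resources(manifest: dict[str, Any]) -> list[str]:
    """Return one message per missing CPU/memory request or limit.

    Hypothetical helper: assumes a Deployment-shaped dict with containers
    under spec.template.spec.containers.
    """
    problems: list[str] = []
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        resources = container.get("resources", {})
        for section in ("requests", "limits"):
            for key in ("cpu", "memory"):
                if key not in resources.get(section, {}):
                    problems.append(f"{name}: missing resources.{section}.{key}")
    return problems


# Illustrative manifest (made up for this sketch): "web" is fully specified,
# "sidecar" only declares a CPU request.
deployment = {
    "kind": "Deployment",
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "web",
                        "resources": {
                            "requests": {"cpu": "100m", "memory": "128Mi"},
                            "limits": {"cpu": "500m", "memory": "256Mi"},
                        },
                    },
                    {
                        "name": "sidecar",
                        "resources": {"requests": {"cpu": "50m"}},
                    },
                ]
            }
        }
    },
}

violations = find_missing_resources(deployment)
for message in violations:
    print(message)
```

In a CI job, a non-empty result could be used to fail the build, which corresponds to the "quality gate" idea in row 20. A real pipeline would more likely express this rule in a dedicated policy engine.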