ISSN: 2754-6659 | Open Access

Journal of Artificial Intelligence & Cloud Computing

ML Powered Container Management Platform: Revolutionizing Digital Transformation through Containers and Observability

Author(s): Amreth Chandrasehar

Abstract

As companies adopt digital transformation, cloud-native applications become a critical part of their architecture and roadmap. Enterprise applications and tools developed using cloud-native architecture are containerized and deployed on container orchestration platforms. Containers have revolutionized application deployment by simplifying the management, scaling and operation of workloads. However, platform operators face many issues, such as the complexity of managing large-scale environments, security, networking, storage, observability and cost. This paper discusses how to build a container management platform that applies AI and ML models to monitoring data to aid an organization's digital transformation journey.

Introduction

As companies accelerate digital transformation, they are increasingly investing in cloud-native development based on microservices and container architectures. As they migrate workloads to container architectures, container management becomes more important for ensuring the reliability and scalability of the applications hosted on Kubernetes. Container management covers creating, deploying, operating, monitoring and scaling applications. An application and its dependencies can be bundled into a single container for easier deployment and management, which brings greater agility and quicker releases.

One reason containers became so popular with developers is that you can encapsulate an approved version of an environment in a container image and have all the developers work from that image. This way, developers working on new applications will use the exact parameters their code will run on in production. This should make it seamless to go from test to dev to prod. No more “well it worked in dev…” or “it worked on my system…”, because they will be coding in a replica of the approved environment [1].

Container management software such as Kubernetes, Mesos or Docker Swarm provides capabilities for developers and operators to manage deployed applications. Automation, monitoring and auto-remediation features, such as replacing failed pods and auto-scaling, help developers and operations engineers (SRE, DevOps). The need arose when enterprises had to maintain too many containers and manage applications across multiple environments and cloud providers, which became too complex for teams to handle manually.
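
As an illustration of the auto-scaling feature mentioned above, the Kubernetes Horizontal Pod Autoscaler documents its core rule as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The sketch below reproduces only that ratio rule. The function name and the minimum and maximum bounds are illustrative assumptions, and the real autoscaler adds tolerances and stabilization behaviour that are omitted here.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Replica count suggested by the ratio rule, clamped to [min, max].

    The bounds are hypothetical defaults chosen for this example.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# 4 pods averaging 90% CPU against a 60% target suggests scaling out to 6 pods.
print(desired_replicas(4, 90.0, 60.0))
```

The same rule scales in when the observed metric falls below the target, which is why a lower bound is applied.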

An AI-powered Container Management Platform will help enterprises and smaller organizations maintain their applications, increase developer productivity, improve application reliability and integrate with an auto-remediation platform to resolve issues without human intervention.

Review of Cloud Modernization and Digital Transformation at Organizations

Cloud modernization and digital transformation are two closely related concepts that are essential for businesses to stay ahead of the competition.

• Cloud modernization is the process of moving IT infrastructure and applications to the cloud. This can help to improve agility, scalability, and cost-efficiency.

• Digital transformation is the process of using digital technologies to change the way business is done. This can help to improve customer experience, create new products and services, and gain a competitive advantage.

Cloud modernization and digital transformation are often seen as separate initiatives, but they are actually two sides of the same coin. By modernizing IT infrastructure and applications, an organization lays the foundation for digital transformation.

Some of the benefits of cloud modernization and digital transformation are:

• Improved Agility: Cloud-based systems are more agile and can be scaled up or down quickly to meet changing demand. This helps organizations respond more quickly to market changes and customer needs.

• Increased Scalability: Cloud-based systems can be scaled up or down to meet changing demand. During peak or sustained load, scaling the resources in cloud or container-based solutions helps to meet customer demand.

• Reduced Costs: Cloud-based systems can help to reduce costs by eliminating the need to purchase and maintain physical hardware and software. A pay-per-use model can reduce costs significantly.

• Improved Security: Cloud providers offer a variety of security features that can help to protect data and applications. Cloud providers also notify customers of problems such as leaked credentials, crypto mining on their resources or hijacked accounts.

• Improved Compliance: Cloud providers can help customers comply with industry regulations, such as those related to data privacy and security.

Cloud modernization and digital transformation are therefore essential initiatives for improving business agility, scalability and cost-efficiency.

Some additional tips for cloud modernization and digital transformation:

• Start with a clear understanding of business goals.
• Define the scope, goals and outcomes of cloud modernization and digital transformation.
• Involve all stakeholders, such as the IT team, business leaders and customers.
• Choose the cloud platform that fits the business and technology needs.
• Track and measure progress so that the approach can be adjusted as needed.

As organizations move towards containerized workloads and dynamic microservice architectures, old practices of bolting on monitoring after the fact no longer scale. It’s critical therefore that modern instrumentation should be employed to better understand the properties of an application and its performance as complex distributed systems take shape across the delivery pipeline and into production [2].

Building a Container Management Platform (CMP): A Theoretical Framework

Container orchestration, then, is the process of managing the complete lifecycle of containers in an automated fashion. We’re talking about deployment, provisioning of hosts, scaling, health monitoring, resource sharing and load balancing etc. All these individual tasks associated with managing containers pile up and up as your project grows and before you know it, you’ve got quite a complex problem. That is the challenge that container orchestration solutions are trying to solve [3].

As companies adopt CNCF frameworks and tools to deploy and manage workloads in the cloud, a Container Management Platform (CMP) is required for teams to manage applications effectively. A CMP requires integrations with container clusters, monitoring, a CMDB, change management and security scanning tools to enhance developer and operator productivity.

Below is a data flow architecture diagram of a container management platform. The platform contains an automated issue resolution workflow, self-service dashboards, executive dashboards, an application management user interface, monitoring with SRE metrics, alerts, and ML and AI features.

Data is collected from various container orchestration platforms such as Kubernetes, Mesos, Docker Swarm and ECS and sent to the central CMP database. Data from the CMDB, change management tools and container scanning tools, along with code pull request information from the source code repository, is also collected and stored in the CMP database.

[Figure: data flow architecture of the container management platform]
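
The collection step described above implies that records from different orchestrators are mapped onto one common shape before they are stored. The sketch below is one possible way to do that, not the paper's implementation. The `WorkloadRecord` schema and both converter functions are hypothetical, and the input dictionaries are simplified versions of a Kubernetes Deployment object and an ECS service description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadRecord:
    """Hypothetical common schema for the central CMP database."""
    orchestrator: str
    cluster: str
    namespace: str
    name: str
    replicas: int


def from_kubernetes(cluster: str, deployment: dict) -> WorkloadRecord:
    """Map a simplified Kubernetes Deployment object to the common schema."""
    meta = deployment["metadata"]
    return WorkloadRecord(
        orchestrator="kubernetes",
        cluster=cluster,
        namespace=meta.get("namespace", "default"),
        name=meta["name"],
        replicas=deployment.get("spec", {}).get("replicas", 1),
    )


def from_ecs(service: dict) -> WorkloadRecord:
    """Map a simplified ECS service description to the common schema."""
    return WorkloadRecord(
        orchestrator="ecs",
        # The cluster name is the last path segment of the cluster ARN.
        cluster=service["clusterArn"].rsplit("/", 1)[-1],
        namespace="",  # ECS has no namespace concept, so it is left empty
        name=service["serviceName"],
        replicas=service.get("desiredCount", 0),
    )


k8s = {"metadata": {"name": "cart", "namespace": "shop"}, "spec": {"replicas": 3}}
ecs = {"serviceName": "billing", "desiredCount": 2,
       "clusterArn": "arn:aws:ecs:us-east-1:123456789012:cluster/prod"}
print(from_kubernetes("c1", k8s))
print(from_ecs(ecs))
```

Keeping one schema means dashboards and ML models can query workloads without knowing which orchestrator produced them.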

Best practices for building a container management platform:
1. Simplified administration
2. Self-service features for end users
3. Enabling automated monitoring for all containers
4. High availability (HA) and scaling of the control plane layer
5. Monitoring fast and disposable containers
6. Security and networking layer on all containers
7. Automated clean-up tasks, which help to minimize data storage, third-party API and network costs
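
As a minimal sketch of the automated clean-up practice, the example below separates stored records into those to keep and those to purge based on a retention window. The record layout, the `last_seen` field and the 30-day default are assumptions made for illustration only.

```python
from datetime import datetime, timedelta, timezone


def split_expired(records, now, retention=timedelta(days=30)):
    """Return (keep, purge): records seen within `retention`, and older ones."""
    cutoff = now - retention
    keep = [r for r in records if r["last_seen"] >= cutoff]
    purge = [r for r in records if r["last_seen"] < cutoff]
    return keep, purge


now = datetime(2023, 6, 30, tzinfo=timezone.utc)
records = [
    {"id": "pod-a", "last_seen": datetime(2023, 6, 29, tzinfo=timezone.utc)},
    {"id": "pod-b", "last_seen": datetime(2023, 4, 1, tzinfo=timezone.utc)},
]
keep, purge = split_expired(records, now)
print([r["id"] for r in purge])  # ['pod-b']
```

Running such a task on a schedule keeps data for short-lived containers from accumulating indefinitely.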

Implementing Monitoring and Alerting Methodology

One of the fundamental components of a workload management tool is monitoring. The three pillars of observability (logs, metrics and traces), along with events, need to be collected into an observability platform. The Container Management Platform consumes this data and either applies ML models or uses it as-is to generate alerts, correlate signals and build dashboards.
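
To make the idea of generating alerts from collected metrics concrete, the sketch below flags a metric sample that deviates strongly from its recent history. A z-score test is used here as a simple statistical stand-in for an ML model, since the paper does not specify one. The function name and the threshold of three standard deviations are assumptions.

```python
from statistics import mean, stdev


def is_anomalous(history, latest, threshold=3.0):
    """Flag `latest` if it is more than `threshold` standard deviations
    away from the mean of `history`."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > threshold


latency_ms = [100, 102, 98, 101, 99]
print(is_anomalous(latency_ms, 130))  # True
print(is_anomalous(latency_ms, 101))  # False
```

A production system would more likely use seasonal baselines or a trained model, but the alerting contract (history in, boolean out) can stay the same.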

“Metadata is what’s going to allow us to do all of this correlation but still build systems that are useful individually. Also as a whole, it will do all this great correlation across the signals,” said Branczyk [4].
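
The correlation described in the quote can be sketched as grouping signals by the metadata labels they share. In the example below, the label names (`cluster`, `namespace`, `pod`) and the signal layout are assumptions chosen for illustration. Real observability platforms define their own label conventions.

```python
from collections import defaultdict

# Hypothetical labels that every signal is assumed to carry.
CORRELATION_KEYS = ("cluster", "namespace", "pod")


def correlate(signals):
    """Group logs, metrics and traces that share the same metadata labels."""
    grouped = defaultdict(lambda: defaultdict(list))
    for signal in signals:
        key = tuple(signal["labels"].get(k) for k in CORRELATION_KEYS)
        grouped[key][signal["kind"]].append(signal["payload"])
    return {key: dict(kinds) for key, kinds in grouped.items()}


cart = {"cluster": "c1", "namespace": "shop", "pod": "cart-1"}
pay = {"cluster": "c1", "namespace": "shop", "pod": "pay-7"}
signals = [
    {"kind": "log", "labels": cart, "payload": "OOMKilled"},
    {"kind": "metric", "labels": cart, "payload": {"memory_mb": 512}},
    {"kind": "trace", "labels": pay, "payload": "span-42"},
]
for key, kinds in correlate(signals).items():
    print(key, kinds)
```

Each signal store remains usable on its own, while the shared labels let an operator pivot from a log line to the metrics and traces of the same pod.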

Below is a data flow diagram covering data creation, collection, transformation, storage and processing, through to making the data available to end users (internal or external customers).

[Figure: data flow from data creation to end-user consumption]

Results and Findings

As organizations adopt cloud modernization and digital transformation, management of workloads becomes critical. Many organizations face challenges because they lift and shift workloads from on-premises systems to the cloud and thereby bring all existing issues to the cloud as well. Many organizations have also reported unsuccessful digital transformation and cloud journeys.

The following critical factors can be applied for a successful migration and optimal use of workloads in a container management solution:

  1. Technical feasibility - Understanding the technology, code, dependencies, versions, instance types, deployment model, etc. can make or break the migration.
  2. Cloud and workload Migration process
  3. Project Management with continuous improvement tracking
  4. Executive Sponsorship
  5. Define clear goals and objectives
  6. Infrastructure Blueprint - Create a blueprint for all platforms and cloud resources. Automation is critical to have high SLAs in the system.
  7. Environment Readiness for deploying applications and other infrastructure components
  8. Training and Support for internal and external users
  9. Observability platform to ingest logs, metrics, traces and events
  10. Use of AI and ML technologies to predict failures and forecast issues
  11. Automated Workflow for remediating commonly occurring issues instead of manual steps
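
Factors 10 and 11 can be illustrated together: a forecast estimates when a resource will cross a limit, and the result selects a remediation step. The sketch below uses an ordinary least-squares line as a deliberately simple stand-in for an ML forecast. The function names, the 24-hour urgency window and the action names `expand_volume` and `open_ticket` are all hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hours_until(limit, hours, usage):
    """Hours after the last sample until the fitted trend reaches `limit`.

    Returns None when usage is flat or falling.
    """
    slope, intercept = linear_fit(hours, usage)
    if slope <= 0:
        return None
    crossing = (limit - intercept) / slope
    return max(0.0, crossing - hours[-1])


def choose_action(hours_left, urgent_within=24.0):
    """Pick a (hypothetical) remediation step from the forecast."""
    if hours_left is None:
        return "none"
    return "expand_volume" if hours_left <= urgent_within else "open_ticket"


# Disk usage in percent, sampled hourly and growing 2 points per hour.
left = hours_until(90.0, [0, 1, 2, 3], [50.0, 52.0, 54.0, 56.0])
print(left, choose_action(left))  # 17.0 expand_volume
```

In practice the chosen action would be handed to a workflow engine with approvals and rollback, which this sketch leaves out.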

CMP solutions have helped organizations deliver a 30% increase in developer productivity, a 44% improvement in MTTR and a 35% improvement in MTTD. These figures are averages of data points collected from CMP solutions deployed across organizations [5, 6].
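
MTTD and MTTR figures of this kind are typically derived from incident timestamps. The sketch below shows one way to compute them and to express a change as a percentage reduction, since lower values are better for both metrics. It assumes that both metrics are measured from the start of the incident. Definitions vary between organizations, and the incident data is invented for illustration.

```python
from datetime import datetime, timedelta


def mean_delta(pairs):
    """Average duration over (start, end) timestamp pairs."""
    deltas = [end - start for start, end in pairs]
    return sum(deltas, timedelta()) / len(deltas)


def mttd(incidents):
    """Mean time from incident start to detection."""
    return mean_delta([(i["started"], i["detected"]) for i in incidents])


def mttr(incidents):
    """Mean time from incident start to resolution (assumed definition)."""
    return mean_delta([(i["started"], i["resolved"]) for i in incidents])


def percent_reduction(before, after):
    """Percentage by which a duration shrank; positive means improvement."""
    return (before - after) / before * 100.0


incidents = [
    {"started": datetime(2023, 5, 1, 10, 0), "detected": datetime(2023, 5, 1, 10, 10),
     "resolved": datetime(2023, 5, 1, 11, 0)},
    {"started": datetime(2023, 5, 2, 14, 0), "detected": datetime(2023, 5, 2, 14, 20),
     "resolved": datetime(2023, 5, 2, 14, 40)},
]
print(mttd(incidents))  # 0:15:00
print(mttr(incidents))  # 0:50:00
print(percent_reduction(timedelta(minutes=80), mttr(incidents)))  # 37.5
```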

Discussing Implications

Many organizations continue to face challenges and difficulties with cost, monitoring, developer productivity, efficiency, incidents and poor customer service. Applying the best practices and principles discussed in this paper will help organizations manage their workloads efficiently.

The container management platform can be a very effective tool for accelerating developer and operations team productivity and for reducing the time taken for production deployments and rollbacks. Integrating the CMP with observability also enables proactive alerting and quick lookup of logs, metrics and traces in one place, which helps teams find issues faster and improves the SRE metrics MTTR and MTTD.

Conclusion

A Container Management Platform is an essential part of cloud modernization and digital transformation. Observability is a key factor in improving product quality [5]. Observability also helps tear down the silos between teams: it is not just for an SRE, operations, DevOps or production support team but for the entire company, and it is therefore everyone's responsibility [6]. Combined with AI and ML technologies, it allows developers and operators to spend less time on incidents and more time on innovative projects.

References

  1. What are container management systems? (2019) 24x7 IT Connection https://24x7itconnection.com/2019/07/09/what-are-container-management-systems/.
  2. Monitoring and Observability-What’s the Difference and Why Does It Matter? (2018) The New Stack https://thenewstack.io/monitoring-and-observability-whats-the-difference-and-why-does-it-matter/.
  3. Container Orchestration in 2019 (2019) Scout APM https://scoutapm.com/blog/container-orchestration-in-2019.
  4. What's Next for Observability (2019) Grafana Labs https://grafana.com/blog/2019/10/21/whats-next-for-observability/.
  5. Observability at System73 (2019) Medium https://medium.com/system73/observability-at-system73-f20edf796ae1.
  6. Observability New Buzz Word (2019) LinkedIn https://www.linkedin.com/pulse/observability-new-buzz-word-deepti-bhutani/.