Microsoft AKS updates 2024 - Q4 and KubeCon 2024 North America.
In this blog, I want to give an overview of all the features that reached General Availability, entered Technical Preview, or reached End of Support at Microsoft in Q4 2024. This information can be found at Microsoft Azure Updates.
Features that are now supported by Microsoft (GA):
- [General available] Parallel Image Pulls by Default in AKS
There are two types of container image pulls: serialized and parallel. By default, AKS versions earlier than 1.31 use serialized image pulls. Starting with AKS version 1.31 preview, AKS defaults to parallel image pulls. Generally, serialized image pulls are less performant than parallel pulls, especially when dealing with large or numerous container images. This update helps to enhance overall system efficiency. To learn more, click here.
- [General available] Open-source feature update: vLLM model serving in KAITO
KAITO now supports high throughput model serving with the open-source vLLM serving engine. In the KAITO inference workspace, you can deploy models using vLLM to batch process incoming requests, accelerate inference, and optimize your AI workload by default, preventing out-of-memory (OOM) errors and minimizing workload disruptions. To learn more, click here.
- [General available] Force attach/detach API support in AKS
With the release of Kubernetes v1.30, the Azure Disk CSI driver has adopted the force detach capability. This feature allows the driver to force detach zone-redundant storage (ZRS) data disks from VM nodes in a failed zone and attach them to another VM, reducing the Recovery Time Objective (RTO). Stateful workloads in AKS clusters can now quickly recover from zone failures by detaching ZRS data disks from affected VM nodes and reattaching them to new VMs. To learn more, click here.
- [General available] Vaulted backup for AKS
Azure Backup now supports vaulted backups for AKS, enabling cross-region disaster recovery, long-term retention and immutable security. This simplifies compliance and strengthens resilience for cloud-native applications. Customers can protect clusters against a regional disaster, store backup data for up to 10 years to meet compliance requirements, and secure backup data at an offsite location to safeguard against ransomware threats. To learn more, click here.
- [General available] AKS automated deployments UI
The AKS automated deployments feature has received UI updates designed to make it easier for customers to get started with Azure Kubernetes Service. With these new improvements, you can now select specific locations for saving autogenerated Dockerfiles and Kubernetes manifest files. With automated deployments, you can easily get your apps up and running on Azure Kubernetes Service. To learn more, click here.
- [General available] Enhanced AKS logs with Kubernetes metadata and logs filtering
Azure Kubernetes Service (AKS) logs now include detailed metadata, such as PodLabels, PodAnnotations, PodUid, Image, ImageID, ImageRepo, and ImageTag. These additions provide richer context and improved visibility into workloads, aiding in troubleshooting and monitoring. The integration with Grafana further enhances this by enabling users to visualize and analyze logs more effectively, utilizing Grafana's powerful dashboard capabilities. To learn more, click here.
- [General available] Regional Disaster Recovery by Azure Backup for AKS
In today’s digital landscape, cloud administrators face the challenge of ensuring data and applications remain protected, compliant, and resilient amidst evolving cyber threats. Microsoft is excited to announce the general availability (GA) of Vaulted Backup support in Azure Backup for Azure Kubernetes Service (AKS). This new feature helps organizations meet compliance requirements, enhance operational resilience, and protect cloud-native applications from regional disasters. Key Benefits for Azure Customers:
- Disaster Recovery Across Regions: Cross-Region Restore supports critical failover capabilities, ensuring business continuity and disaster recovery compliance;
- Regulatory Compliance: Achieve long-term retention (LTR) for up to 10 years to meet global compliance frameworks;
- Enhanced Security & Cyber Resilience: Immutable vaults and role-based access control protect backup data from ransomware and unauthorized access.
With GA support for Vaulted backups of AKS, Azure Backup simplifies compliance, strengthens security, and improves resilience for cloud-native environments. Activate Azure Backup for your AKS clusters today to enable robust, compliant data protection. To learn more, click here.
- [General available] Enhancements on Azure Container Storage for performance, scalability and operational insights
Microsoft is excited to unveil key advancements in Azure Container Storage, our purpose-built storage solution for stateful containers on Azure Kubernetes Service (AKS). These updates bring increased performance and improved reliability to support even the most demanding containerized applications. Microsoft has optimized ephemeral disk performance to improve read and write IOPS. These enhancements include updates to local NVMe performance both with and without replication, detailed in "Optimize performance when using local NVMe without replication" and "Optimize performance when using local NVMe with replication". With these new capabilities, Azure Container Storage empowers you to achieve optimal performance, visibility, and resilience, making it an even stronger choice for deploying scalable, containerized applications in the cloud. Join us at Ignite to explore how these updates can elevate your cloud storage strategy! To learn more, click here.
- [General available] Kubernetes Metadata and Log Filtering in Azure Monitor Container Insights
Microsoft is excited to announce the general availability of Kubernetes Metadata and Logs Filtering in Azure Monitor – Container Insights! This enhancement adds Kubernetes metadata to the ContainerLogV2 schema, including PodLabels, PodAnnotations, PodUid, Image, ImageID, ImageRepo, and ImageTag. Users can customize metadata fields via ConfigMap, and all fields are collected by default. The new Logs Filtering feature allows for precise filtering of both workload and system pods/containers. Additionally, the enhanced ContainerLogV2 schema includes log level information to assess application health with color-coded severity levels. A Grafana Dashboard is available for visualizing log levels, log volume, rate, records, and more, empowering in-depth analysis and real-time monitoring. These advancements provide users with richer context and improved visibility into their workloads. To learn more, click here.
- [General available] Trusted launch for Azure Kubernetes Service (AKS)
Trusted Launch enabled nodes are now generally available on AKS. Trusted launch improves the security of generation 2 virtual machines (VMs) by protecting against advanced and persistent attack techniques. It enables administrators to deploy AKS nodes, which contain the underlying virtual machines, with verified and signed bootloaders, OS kernels, and drivers. By using secure and measured boot, administrators gain insight into, and confidence in, the integrity of the entire boot chain. Trusted launch is composed of several coordinated infrastructure technologies that can be enabled independently. Each technology provides another layer of defense against sophisticated threats:
- vTPM - Trusted launch introduces a virtualized version of a hardware Trusted Platform Module (TPM), compliant with the TPM 2.0 specification. It serves as a dedicated secure vault for keys and measurements;
- Secure Boot - At the root of trusted launch is Secure Boot for your VM. This mode, which is implemented in platform firmware, protects against the installation of malware-based rootkits and boot kits.
To learn more, click here.
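Because the two technologies can be enabled independently, a deployment script might assemble the cluster-creation command from two switches. The sketch below assumes the Azure CLI flags `--enable-secure-boot` and `--enable-vtpm`; check them against the current AKS documentation. The helper function and the cluster and resource group names are hypothetical placeholders. It only builds the argument list and does not call Azure.

```python
from typing import List


def trusted_launch_args(cluster: str, resource_group: str,
                        secure_boot: bool, vtpm: bool) -> List[str]:
    """Build an `az aks create` argument list (hypothetical helper).

    The two Trusted launch flags are assumptions to verify against the
    Azure CLI reference before use.
    """
    args = ["az", "aks", "create",
            "--name", cluster,
            "--resource-group", resource_group]
    if secure_boot:
        args.append("--enable-secure-boot")  # Secure Boot toggle
    if vtpm:
        args.append("--enable-vtpm")  # vTPM toggle
    return args


if __name__ == "__main__":
    # Placeholder names; enable only vTPM to show the toggles are independent.
    print(" ".join(trusted_launch_args("myCluster", "myResourceGroup",
                                       secure_boot=False, vtpm=True)))
```

Keeping each toggle as its own boolean mirrors the point above that each technology is a separate layer of defense.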
- [General available] Extended workload scheduling capabilities in Azure Kubernetes Fleet Manager
There are two generally available capabilities as part of the extended scheduling: cluster-wide and namespace-wide attribute overrides for resources. At the cluster level, operators can use ClusterResourceOverride to define rules based on cluster labels, specifying changes to be applied to various cluster-wide resources such as namespaces, cluster roles, cluster role bindings, or custom resource definitions (CRDs). These modifications might include updates to permissions, configurations, or other parameters. For namespace-level control, a ResourceOverride can be used to define rules based on cluster labels, specifying changes to be applied to resources such as Deployments, StatefulSets, ConfigMaps, or Secrets. These changes can include updates to container images, environment variables, resource limits, or any other configurable parameters. Together these overrides help ensure consistent management and enforcement of configurations across your Fleet-managed Kubernetes clusters. To learn more, click here.
- [General available] Intelligent cross-cluster Kubernetes resource placement using Azure Kubernetes Fleet Manager
Application developers often need to deploy Kubernetes resources into multiple clusters. Fleet operators often need to pick the best clusters for placing the workloads based on heuristics such as cost of compute in the clusters or available resources such as memory and CPU. It's tedious to create, update, and track these Kubernetes resources across multiple clusters manually. Fleet now provides an intelligent resource placement capability that can make scheduling decisions based on the following properties: node count, cost of compute in target member clusters, and resource (CPU/Memory) availability in target member clusters. To learn more, click here.
- [General available] Advanced Container Networking Services
Advanced Container Networking Services (ACNS) is now generally available. ACNS includes Advanced Network Observability, providing pod-level metrics, DNS insights, and enhanced troubleshooting tools for network debugging in AKS. FQDN filtering is also generally available, simplifying network policy management by using domain names instead of IP addresses. This reduces the need for frequent updates and minimizes configuration errors. Both features are integrated with Azure Monitor, enabling customizable metrics and pre-built dashboards in Azure Managed Grafana, improving network management and security for AKS environments. To learn more, click here to read the blog or the documentation.
- [General available] Delete a specific machine when scaling down a nodepool
It is now possible to choose which specific VM to delete when scaling down a node pool in AKS. This provides greater control and flexibility in managing resources within the node pool. To learn more, click here.
- [General available] Ignore PDBs on node pool deletion
Node pools in AKS can now be deleted even if there are pods monitored by a PodDisruptionBudget (PDB) – previously, the deletion of the node pool could fail due to an unsatisfied PDB. This enhancement allows the deletion to proceed by ignoring the PDB error that would previously block the deletion from continuing. To learn more, click here.
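To see why an unsatisfied PDB could block a deletion, here is a minimal sketch of the usual disruption-budget arithmetic, assuming a PDB expressed as `minAvailable` with an absolute pod count. The function names and the `ignore_pdb` parameter are hypothetical, and this is not how AKS implements the feature. It only illustrates how zero allowed disruptions turns into a blocked eviction, and what ignoring the PDB changes.

```python
def allowed_disruptions(healthy_pods: int, min_available: int) -> int:
    """Pods that may be evicted right now without violating the budget."""
    return max(0, healthy_pods - min_available)


def deletion_can_proceed(healthy_pods: int, min_available: int,
                         ignore_pdb: bool = False) -> bool:
    """Hypothetical check: evictions are allowed, or the PDB is ignored."""
    if ignore_pdb:
        return True
    return allowed_disruptions(healthy_pods, min_available) > 0


if __name__ == "__main__":
    # Three healthy pods with minAvailable=3 leaves no room for eviction.
    print(deletion_can_proceed(3, 3))                   # blocked by the PDB
    print(deletion_can_proceed(3, 3, ignore_pdb=True))  # proceeds anyway
```

Ignoring a PDB trades the availability guarantee for a deletion that completes, so it fits best when the workload is being torn down anyway.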
Features that are currently in Public Preview and not yet GA
- [Public Preview] Network isolated cluster in AKS
Today you can control an AKS cluster's egress traffic using Azure Firewall. While this configuration is intended to isolate the cluster to protect sensitive business or customer data, it adds an additional layer of management complexity and cost. AKS now provides the option to use network isolated clusters to simplify the process of restricting network access and reduce the risk of unintentional exposure of the cluster's public endpoints to prevent security breaches. To learn more, click here.
- [Public Preview] IMDS restriction support in AKS
Currently, all pods on AKS nodes can access the AKS worker node's Azure Instance Metadata Service (IMDS) endpoint. AKS now offers a managed solution that restricts IMDS endpoint access for customer pods. Only AKS system pods and user pods with host network can access IMDS for retrieving information or authentication. To learn more, click here.
- [Public Preview] AKS Communication Manager
The AKS Communication Manager, now in public preview, simplifies notifications for all your AKS maintenance tasks by leveraging Azure Resource Notification and Azure Resource Graph frameworks. It provides timely alerts on event triggers and outcomes, allowing you to closely monitor your upgrades. In case of maintenance failures, it notifies you with the reasons for the failure, reducing operational hassles related to observability and follow-ups. To learn more, click here.
- [Public Preview] Windows GPUs for compute-intensive workloads on AKS
AKS now supports GPU-enabled Windows node pools, in public preview, to run compute-intensive Kubernetes workloads. AKS specifies a default GPU driver type for each supported GPU-enabled VM. Because workload and driver compatibility are important for functioning GPU workloads, you can now specify the driver type for your Windows GPU node. When creating a Windows agent pool with GPU support, you have the option to specify the type of GPU driver using the --driver-type flag. The available options are: "GRID" (for applications requiring virtualization support) and "CUDA" (optimized for computational tasks in scientific computing and data-intensive applications). This feature is currently not supported for Linux GPU node pools. To learn more, click here.
- [Public Preview] KAITO managed add-on now available in the AKS Visual Studio Code extension
The AI toolchain operator (KAITO) managed add-on is now available in the Azure Kubernetes Service (AKS) Visual Studio Code extension. With an intuitive and visually engaging UI, this add-on is designed to simplify AI inference development. Customers can enable KAITO, browse through the supported options, and choose an open-source AI model to deploy to their AKS cluster. Once deployed, they can access model logs, test performance, and fully engage with the model workspace directly from VS Code. To learn more, click here.
- [Public Preview] AKS support for private ingress on cluster create or through API
When you enable the application routing add-on with NGINX, it creates an ingress controller configured with a public facing Azure Load Balancer. Starting with Kubernetes 1.30, you can control this behavior when enabling the add-on by choosing whether it gets a public or an internal IP. To learn more, click here.
- [Public Preview] GitHub Copilot for Azure - AKS plugins
GitHub Copilot for Azure now supports Azure Kubernetes Service (AKS) plugins. The extension, which enables AKS plugins for GitHub Copilot for Azure (@azure), allows users to perform various tasks related to AKS directly from the GitHub Copilot Chat view. These tasks include creating an AKS cluster, deploying a manifest to an AKS cluster, and generating kubectl commands. To learn more, click here.
- [Public Preview] Upgrade algorithm improvements in AKS
AKS upgrades currently fail when encountering a Pod drain failure. To improve upgrade efficiency, a new algorithm is being introduced. It allows you to configure upgrades so that if a node is blocked, AKS will use any available surge capacity to continue upgrading other nodes, labeling the blocked node as 'quarantined'. Failure error messages are updated to reflect the post-upgrade status accurately. To learn more, click here.
- [Public Preview] Multi-cluster Auto-upgrade in Azure Kubernetes Fleet Manager
Platform administrators managing a large number of Kubernetes clusters face the challenge of updating Kubernetes or node images in a way that is safe and predictable. To address this challenge, Azure Kubernetes Fleet Manager (Kubernetes Fleet) provides update runs which are used to deploy updates across multiple Kubernetes clusters. Kubernetes Fleet is announcing auto-upgrade support, which provides an automated trigger for update runs based on new Kubernetes or node image versions being published to Azure. Admins can create multiple auto-upgrade profiles for their fleet to capture combinations of Kubernetes and node image version updates. Additionally, existing update run strategies can be used to determine the order in which clusters are updated, while guaranteeing cluster upgrades occur within the maintenance windows configured at each cluster. To learn more, click here.
- [Public Preview] Dynamic system node pool creation in AKS Automatic
AKS Automatic will now dynamically select an appropriate virtual machine SKU for the system node pool based on the capacity available in your Azure subscription. To learn more, click here.
- [Public Preview] Auto instrumentation for AppInsights in AKS
Auto-instrumentation for AppInsights is now available in public preview for Azure Kubernetes Service. Auto-instrumentation enables Application Insights to make telemetry like metrics, requests, and dependencies available in your Application Insights resource. It provides easy access to application performance monitoring (APM) experiences such as the application dashboard and application map. Auto-instrumentation automatically injects the Azure Monitor OpenTelemetry distro into your application pods to generate application monitoring telemetry. This preview supports .NET, Java, and JavaScript (Node.js). Support for Python is coming soon. To learn more, click here.
- [Public Preview] AKS security dashboard
AKS is introducing a security dashboard in the portal. You can now have full visibility over the runtime and host vulnerabilities in your AKS cluster. The Defender for Cloud blade in the Azure Kubernetes Service (AKS) portal, in preview, offers a simplified and streamlined experience for the resource owner or a cluster administrator. This capability is unique to Microsoft as it provides granular visibility into the container posture assessments (vulnerability assessment, compliance, security best practices, CVE remediation) and offers actionable security insights at a cluster level, without the cluster admin having to leave the AKS portal. To learn more, click here.
- [Public Preview] Static egress gateway for AKS
Static egress gateway for Azure Kubernetes Service (AKS) is now in public preview. This feature allows AKS customers to configure a fixed source IP for out-of-cluster communications without incurring the significant cost of deploying a dedicated node pool with a NAT gateway. The static egress gateway enables precise control over egress traffic, simplifying integration with external systems and enhancing network security. To learn more, click here.
- [Public Preview] Azure Linux 3.0 on Azure Kubernetes Service v1.31
Azure Linux 3.0, the next major version release of the Azure Linux container host for Azure Kubernetes Service (AKS), is now available in preview on AKS version 1.31. Azure Linux 3.0 offers increased package availability and versions, an updated kernel, and improvements to performance, security, tooling, and developer experience. To learn more, click here.
- [Public Preview] SeccompDefault support in AKS
SeccompDefault is available in public preview as a new parameter through custom node configuration. Secure computing mode (seccomp) is used to restrict the syscalls a container can send to the kernel. This establishes an extra layer of protection against common system call vulnerabilities exploited by malicious actors and allows you to specify a default seccomp profile for all workloads in the node. There are two allowed values for SeccompDefault:
- Unconfined is the default parameter value which does not block any syscalls;
- RuntimeDefault will block restricted syscalls in the containerd seccomp profile.
The RuntimeDefault profile allows a common set of syscalls while blocking those that are less likely to be used or potentially unsafe in a containerized application. This profile aims to provide a reliable set of security defaults while maintaining the functionality of the workload. To learn more, click here.
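As a rough illustration, the sketch below builds the kind of JSON document that custom node configuration accepts for kubelet settings. It assumes the parameter is exposed under the key `seccompDefault`; the key name and the helper function are assumptions for illustration and should be checked against the AKS custom node configuration documentation.

```python
import json

# The two values named above.
ALLOWED_VALUES = ("Unconfined", "RuntimeDefault")


def build_kubelet_config(seccomp_default: str) -> dict:
    """Return a kubelet config fragment (hypothetical helper).

    The "seccompDefault" key name is an assumption, not confirmed by the text.
    """
    if seccomp_default not in ALLOWED_VALUES:
        raise ValueError(
            f"seccomp_default must be one of {ALLOWED_VALUES}, "
            f"got {seccomp_default!r}"
        )
    return {"seccompDefault": seccomp_default}


if __name__ == "__main__":
    # Print the fragment; saving it to a file is left to the caller.
    print(json.dumps(build_kubelet_config("RuntimeDefault"), indent=2))
```

Validating the value before writing the file catches typos such as a lowercase `runtimedefault` early, rather than at node pool creation time.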
Features that are in Private Preview
- [Private preview] Web Application Firewall (WAF) running on Application Gateway for Containers
Application Gateway for Containers now supports Web Application Firewall (WAF) in private preview. WAF’s Default Ruleset provides Azure Kubernetes Service (AKS) users with centralized protection against malicious attacks and exploits such as:
- Cross-site scripting;
- Java attacks;
- Local file inclusion;
- PHP injection attacks;
- Remote command execution;
- Remote file inclusion;
- Session fixation;
- SQL injection;
- Protocol attacks.
Application Gateway for Containers WAF also supports protection against malicious bot activity via the bot manager rulesets. If you’re interested in signing up for this private preview, please fill out this intake form to start the onboarding process.
Features that are retired
- [Retired] None
KubeCon 2024 North America Highlights created by Brendan Burns of Microsoft
- [Project areas and key contributions];
- [Azure Kubernetes Service announcements];