Set up the Azure Kubernetes Service communication manager

The Azure Kubernetes Service (AKS) communication manager streamlines notifications for all your AKS maintenance tasks by using Azure Resource Notifications and Azure Resource Graph frameworks. The communication manager gives you timely alerts on event triggers and outcomes, so that you can closely monitor your upgrades.

If maintenance fails, the communication manager notifies you with the reasons for the failure. This information reduces operational hassles related to observability and follow-ups.

By following the steps in this article, you can set up notifications for all types of automatic upgrades that use maintenance windows.

Prerequisites

Set up the communication manager

  1. In the Azure portal, go to your AKS cluster resource.

  2. Select Monitoring > Alerts > Alert Rules, and then select Create.

  3. On the Condition tab, for Signal name, select Custom log search.

    Screenshot that shows the selection of a custom log search on the pane for creating an alert rule.

  4. In the Search query box, paste one of the following custom queries. In the where id contains clause, replace <subid>, <rgname>, and <clustername> with your subscription ID, resource group name, and cluster name.

    The following query is for notifications of automatic upgrades for clusters:

    arg("").containerserviceeventresources
    | where type == "microsoft.containerservice/managedclusters/scheduledevents"
    | where id contains "/subscriptions/<subid>/resourcegroups/<rgname>/providers/Microsoft.ContainerService/managedClusters/<clustername>"
    | where properties has "eventStatus"
    | extend status = substring(properties, indexof(properties, "eventStatus") + strlen("eventStatus") + 3, 50)
    | extend status = substring(status, 0, indexof(status, ",") - 1)
    | where status != ""
    | where properties has "eventDetails"
    | extend upgradeType = case(
                               properties has "K8sVersionUpgrade",
                               "K8sVersionUpgrade",
                               properties has "NodeOSUpgrade",
                               "NodeOSUpgrade",
                               ""
                           )
    | extend details = parse_json(tostring(properties.eventDetails))
    | where properties has "lastUpdateTime"
    | extend eventTime = substring(properties, indexof(properties, "lastUpdateTime") + strlen("lastUpdateTime") + 3, 50)
    | extend eventTime = substring(eventTime, 0, indexof(eventTime, ",") - 1)
    | extend eventTime = todatetime(tostring(eventTime))
    | where eventTime >= ago(30m) // Ensure this matches aggregation granularity & frequency
    | where upgradeType == "K8sVersionUpgrade"
    | project
        eventTime,
        upgradeType,
        status,
        properties,
        name,
        details
    | order by eventTime asc
    

    The following query is for notifications of automatic upgrades for the node OS:

    arg("").containerserviceeventresources
    | where type == "microsoft.containerservice/managedclusters/scheduledevents"
    | where id contains "/subscriptions/<subid>/resourcegroups/<rgname>/providers/Microsoft.ContainerService/managedClusters/<clustername>"
    | where properties has "eventStatus"
    | extend status = substring(properties, indexof(properties, "eventStatus") + strlen("eventStatus") + 3, 50)
    | extend status = substring(status, 0, indexof(status, ",") - 1)
    | where status != ""
    | where properties has "eventDetails"
    | extend upgradeType = case(
                               properties has "K8sVersionUpgrade",
                               "K8sVersionUpgrade",
                               properties has "NodeOSUpgrade",
                               "NodeOSUpgrade",
                               ""
                           )
    | extend details = parse_json(tostring(properties.eventDetails))
    | where properties has "lastUpdateTime"
    | extend eventTime = substring(properties, indexof(properties, "lastUpdateTime") + strlen("lastUpdateTime") + 3, 50)
    | extend eventTime = substring(eventTime, 0, indexof(eventTime, ",") - 1)
    | extend eventTime = todatetime(tostring(eventTime))
    | where eventTime >= ago(30m) // Ensure this matches aggregation granularity & frequency
    | where upgradeType == "NodeOSUpgrade"
    | project
        eventTime,
        upgradeType,
        status,
        properties,
        name,
        details
    | order by eventTime asc
    
  5. Still on the Condition tab, configure the alert conditions with the following settings:

    • Measure: Select Table rows.
    • Aggregation type: Select Count.
    • Aggregation granularity: Select 30 minutes.
    • Threshold value: Keep at 0.
    • Split by dimensions: For Dimension name, select status. Then select the Include all future values checkbox.

    Screenshot of the configuration options for alert conditions.

  6. In the Split by dimensions area, for Dimension values, select a value. Because you selected status for the dimension name, the available values are Scheduled, Started, Completed, Canceled, and Failed.

    Note

    These status values appear only if your cluster previously executed automatic upgrade operations. For new clusters or for clusters that haven't undergone automatic upgrades yet, the dropdown list might appear empty or show no available dimensions. After your cluster performs its first automatic upgrade, these status values become available for selection.

    Screenshot of the dropdown list boxes in the area for splitting by dimensions.

  7. Go to the Actions tab. Make sure that an action group with the correct email address exists, so that you can receive the notifications:

    1. Select Use action groups > Create an action group.

    2. For Notification type, select Email/SMS message/Push/Voice.

    3. Select the Email checkbox, and then enter the email address in the Email box.

      Screenshot of the pane for entering email information for an action group.

  8. Go to the Details tab. Assign a managed identity so that you can grant access to the necessary resources. In the Identity area, select System assigned managed identity.

    Screenshot that shows selections for assigning a system-assigned managed identity.

  9. Go to the Review + create tab, and then select Create.

  10. Now that you've created the alert rule, you can assign the appropriate roles for the managed identity:

    1. In the alert rule, go to Settings > Identity > System assigned managed identity > Azure role assignments.
    2. Select Add role assignment, select the Reader role, and assign it to the resource group.
    3. Select Add role assignment again, select the Reader role, and assign it to the subscription.

    Tip

    If you don't see the Identity option, make sure that you created your alert rule and that you have the necessary permissions.

After you set up the communication manager, it sends advance notices one week and one day before maintenance starts. It also sends alerts as the maintenance operation progresses.
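
The status and eventTime columns in the queries from step 4 are extracted by position. The first substring call skips past the key name plus three characters, and the second call cuts the result just before the next comma. The following sketch replays that logic on a made-up string so that you can see what each step returns. The sample value is hypothetical and assumes that the serialized properties contain no spaces after colons. Real records might be laid out differently. You can run it in a Kusto query editor that supports the print operator.

    // Hypothetical sample value, not an actual AKS event payload.
    print properties = '{"eventStatus":"Started","lastUpdateTime":"2024-01-01T00:00:00Z"}'
    // Skip the key name, its closing quote, the colon, and the opening quote of the value.
    | extend status = substring(properties, indexof(properties, "eventStatus") + strlen("eventStatus") + 3, 50)
    // Keep everything before the first comma, minus the closing quote that precedes it.
    | extend status = substring(status, 0, indexof(status, ",") - 1)
    | project status

Under those assumptions, the result is a single row with status set to Started. The same two-step pattern produces eventTime from lastUpdateTime in the alert queries.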

Verify the configuration

Wait for the next automatic upgrade of the cluster to start. Then verify that you promptly receive notices at the email address that you configured.

Check the Azure Resource Graph database for the scheduled notification record. Each scheduled event notification should be listed as one record in the ContainerServiceEventResources table.

Screenshot that shows a notification record in Azure Resource Graph.
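
As a sketch of that check, the following query reuses the filters from the alert queries to list scheduled event records for one cluster. It assumes that you run it in Azure Resource Graph Explorer, where the table is referenced directly rather than through arg(""), and that you replace the placeholders with your own values. The row limit of 20 is an arbitrary choice for illustration.

    // Assumption: run in Azure Resource Graph Explorer; replace the <...> placeholders first.
    containerserviceeventresources
    | where type == "microsoft.containerservice/managedclusters/scheduledevents"
    | where id contains "/subscriptions/<subid>/resourcegroups/<rgname>/providers/Microsoft.ContainerService/managedClusters/<clustername>"
    | project id, name, properties
    | limit 20

If the query returns no rows, the cluster might not have a scheduled automatic upgrade yet.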