Thursday, December 16, 2021

Terraform: Azure Kubernetes Service (AKS) logs (stdout, stderr) optimization

Application logs are essential for troubleshooting and monitoring. All modern applications support some form of logging, and container engines are designed to support it as well: containerized applications write to the standard output (stdout) and standard error (stderr) streams, which the engine captures.

Azure Monitor for containers can collect these streams into a Log Analytics workspace. This is a convenient, built-in way to capture logs, but storing every stream (especially stdout) in Log Analytics may not be justified for your business case, because ingestion can become extremely costly.

If your goal is to reduce logging costs, stdout is usually the main culprit behind a significantly higher monthly Azure bill.

To check your log usage, you can run the following query against your Log Analytics workspace. It shows how much billable data each log source and namespace produced over the last hour.

Logs Query: 

// Look at the last hour of data
let startTime = ago(1h);
// Billable container log volume in MB, per log source (stdout/stderr) and container
let containerLogs = ContainerLog
| where TimeGenerated > startTime
| where _IsBillable == true
| summarize BillableDataMBytes = sum(_BilledSize) / (1000. * 1000.) by LogEntrySource, ContainerID;
// Map each container to its namespace
let kpi = KubePodInventory
| where TimeGenerated > startTime
| distinct ContainerID, Namespace;
containerLogs
| join kpi on $left.ContainerID == $right.ContainerID
| extend sourceNamespace = strcat(LogEntrySource, "/", Namespace)
| summarize MB = sum(BillableDataMBytes) by sourceNamespace
| render piechart
 


To reduce the volume of ingested log data, we can turn off collection of stdout and, if environment variable data is not needed, env_var.

Note: Disabling stdout collection does not prevent you from debugging applications with live logs. The logs are still generated and can be viewed either by running "kubectl logs" against the pod or in the Azure portal, under Live Logs in the workloads/pods section.

By default, all logs are collected once oms_agent is enabled with a Log Analytics workspace. Below is an example of enabling it in Terraform on the AKS cluster resource (azurerm_kubernetes_cluster).

Terraform Code:

addon_profile {
  oms_agent {
    enabled                    = true
    log_analytics_workspace_id = var.log_analytics_resource_id
  }
}
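
Note that the addon_profile block belongs to the 2.x series of the azurerm provider. In azurerm 3.0 and later it was removed, and oms_agent became a top-level block of azurerm_kubernetes_cluster with no enabled argument. A minimal sketch under that assumption, reusing the same variable as above (verify it against the provider documentation for the version you run):

```hcl
# Assumes azurerm provider 3.x or later. This block goes directly inside
# the azurerm_kubernetes_cluster resource, in place of addon_profile.
# The presence of the block is what enables the agent.
oms_agent {
  log_analytics_workspace_id = var.log_analytics_resource_id
}
```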


To fine-tune collection and disable the logs you do not need, the OMS agent looks for a specific ConfigMap named "container-azm-ms-agentconfig". The example below is all you need to create that ConfigMap; it disables stdout and environment variable collection for Log Analytics.

Terraform Code:

resource "kubernetes_config_map" "aks_config_map_log_collection" {
  metadata {
    name      = "container-azm-ms-agentconfig"
    namespace = "kube-system"
  }

  data = {
    schema-version               = "v1"
    config-version               = "1.0.0"
    log-data-collection-settings = <<-EOF
      [log_collection_settings]
          [log_collection_settings.stdout]
          enabled = false
          [log_collection_settings.env_var]
          enabled = false
      EOF
  }
}

You can also scope these settings by excluding specific namespaces from collection. The details are at the following link.

https://docs.microsoft.com/en-us/azure/azure-monitor/containers/container-insights-cost#controlling-ingestion-to-reduce-cost
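
As an illustration of namespace exclusion, the sketch below keeps stdout and stderr collection enabled but skips the kube-system namespace. It is a variant of the ConfigMap shown earlier, not an addition to it, because the agent reads only the one ConfigMap with this name. The choice of excluded namespace is an assumption for illustration, and the exclude_namespaces key comes from the Container insights agent configuration format, so check it against the linked page for your agent version.

```hcl
# Variant of the ConfigMap above: collect stdout/stderr everywhere
# except the namespaces listed in exclude_namespaces.
resource "kubernetes_config_map" "aks_config_map_log_collection" {
  metadata {
    name      = "container-azm-ms-agentconfig"
    namespace = "kube-system"
  }

  data = {
    schema-version               = "v1"
    config-version               = "1.0.0"
    log-data-collection-settings = <<-EOF
      [log_collection_settings]
          [log_collection_settings.stdout]
          enabled = true
          exclude_namespaces = ["kube-system"]
          [log_collection_settings.stderr]
          enabled = true
          exclude_namespaces = ["kube-system"]
          [log_collection_settings.env_var]
          enabled = false
      EOF
  }
}
```

Add any noisy application namespaces to the exclude_namespaces lists to cut ingestion further while still collecting logs from the workloads you care about.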
