Friday, March 25, 2022

Azure Highly Available Site to Site (S2S) VPN Using BGP

When setting up VPN tunnels, we need the solution to stay resilient and highly available if a tunnel fails. We achieve that high availability with redundant tunnels.

Most networks today follow the hub-and-spoke model, which means a network most likely consists of multiple Azure VNets peered together. The network address space is propagated when VNets are peered. For the on-premises network to discover the Azure address space across all peered VNets, and to fail over automatically to the redundant VPN tunnel, the BGP protocol is used.

Following is the Microsoft document that explains the different strategies for highly available VPN gateways.

https://docs.microsoft.com/en-us/azure/vpn-gateway/vpn-gateway-highlyavailable

From the document above, I am going to cover redundant tunnels using BGP, as explained in the section "Multiple on-premises VPN devices".


To establish highly available redundant tunnels, Azure VPN Gateway comes with an active/standby mode by default, which means we don't need two gateway endpoints on the Azure side. If the active tunnel fails, it automatically fails over to the standby instance, which keeps the same public IP and BGP IP.

This, however, means we need two VPN connections on the on-premises side, each with a unique public IP.

Below is an example showing how to set this up.

 

Virtual Network Gateway

When creating a new virtual network gateway, I have highlighted in red the settings that need to be set, apart from the other network-related selections you would make.



When the resource is deployed, this sets up the Azure side of the gateway and assigns it a public IP and a private BGP IP. You cannot change these values; the ASN can be changed if desired.
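For readers who prefer infrastructure as code, the same gateway can be sketched in Terraform with the azurerm provider. This is a minimal sketch, assuming a resource group, a public IP, and a subnet named GatewaySubnet already exist; the resource names and ASN are illustrative:

```hcl
# Sketch: active/standby VPN gateway with BGP enabled (azurerm provider).
# The resource group, public IP and GatewaySubnet are assumed to exist.
resource "azurerm_virtual_network_gateway" "vpn" {
  name                = "vpn-gateway" # illustrative name
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name

  type     = "Vpn"
  vpn_type = "RouteBased"
  sku      = "VpnGw1"

  active_active = false # active/standby mode: one public IP, one BGP IP
  enable_bgp    = true

  bgp_settings {
    asn = 65515 # Azure's default private ASN; change if desired
  }

  ip_configuration {
    name                          = "vnetGatewayConfig"
    public_ip_address_id          = azurerm_public_ip.vpn.id
    private_ip_address_allocation = "Dynamic"
    subnet_id                     = azurerm_subnet.gateway.id # must be the "GatewaySubnet"
  }
}
```

With active_active left at false, Azure runs a standby instance behind the same public IP and BGP IP, matching the failover behaviour described above.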



Local Network Gateway

The local network gateway is the on-premises side of the gateway. It is still an Azure resource, but it is needed to represent the customer/on-premises gateway, since it holds the configuration for the on-premises side.

Remember that we need two tunnels from on-premises for high availability; therefore, we need two separate local network gateways, each with a connection to an on-premises VPN device.
I will only show one tunnel setup; you can create another local network gateway (LNG) and a connection for the second on-premises VPN device, which completes your redundant on-premises tunnels with automatic failover.

When creating the local network gateway resource, the IP address field needs to be filled with the on-premises VPN device's public IP address.


BGP setup with the on-premises ASN and the BGP private IP address




After the LNG is set up, you can change its configuration from the settings.



The screenshot shows text boxes for the on-premises public IP, ASN and BGP IP. These are the on-premises/customer gateway values.

The Address Space text box can stay empty since we are using BGP, which learns the on-premises address space automatically. A later screenshot shows the learned routes advertised from on-premises.
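As a Terraform sketch of the same LNG (azurerm provider), with the on-premises public IP, ASN and BGP peer IP as illustrative placeholders; note that address_space is omitted because BGP learns the on-premises prefixes:

```hcl
# Sketch: local network gateway for one on-prem VPN device.
# 203.0.113.10, ASN 65010 and 169.254.21.2 are placeholder values.
resource "azurerm_local_network_gateway" "onprem1" {
  name                = "lng-onprem-1"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  gateway_address     = "203.0.113.10" # on-prem VPN device public IP

  bgp_settings {
    asn                 = 65010          # on-prem ASN
    bgp_peering_address = "169.254.21.2" # on-prem BGP peer IP
  }
  # No address_space block: BGP advertises the on-prem prefixes automatically.
}
# A second azurerm_local_network_gateway, pointing at the second on-prem
# device's public IP, completes the redundant pair.
```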


Create a new connection by selecting the LNG created in the step above and then adding a connection to it.



After the connection is set up, you can change its configuration.

Tip: if the connection is not being established with on-premises, check the IPsec/IKE policy; your on-premises VPN device may require different encryption settings, which can be configured using a custom policy.
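In Terraform, the connection, including a custom IPsec/IKE policy for devices that need specific algorithms, could look like this sketch. The policy values are examples rather than a recommendation, and the gateway/LNG references assume resources like those from the earlier steps:

```hcl
# Sketch: IPsec connection with BGP and a custom IPsec/IKE policy.
resource "azurerm_virtual_network_gateway_connection" "onprem1" {
  name                       = "cn-onprem-1"
  location                   = azurerm_resource_group.rg.location
  resource_group_name        = azurerm_resource_group.rg.name
  type                       = "IPsec"
  virtual_network_gateway_id = azurerm_virtual_network_gateway.vpn.id
  local_network_gateway_id   = azurerm_local_network_gateway.onprem1.id
  shared_key                 = var.shared_key # pre-shared key, matched on-prem
  enable_bgp                 = true

  # Only needed when the on-prem device requires specific algorithms.
  ipsec_policy {
    ike_encryption   = "AES256"
    ike_integrity    = "SHA256"
    dh_group         = "DHGroup14"
    ipsec_encryption = "AES256"
    ipsec_integrity  = "SHA256"
    pfs_group        = "PFS2048"
    sa_lifetime      = 27000 # seconds
  }
}
```

A second connection resource tied to the second LNG gives the redundant tunnel; with enable_bgp on both, failover between them is automatic.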




Advertised and Learned Routes

If your connection is established properly, you can go to BGP Peers under Monitoring on the virtual network gateway to see whether your gateway is properly learning and advertising BGP routes.

Learned Routes:



Advertised Routes






At this point, if you can see advertised and learned routes in the screenshots above, the status shows Connected, and there are non-zero counts in the messages sent and received columns as shown below, then congratulations: you have successfully set up a BGP VPN connection from Azure to on-premises.




Thursday, December 16, 2021

Terraform: Azure Kubernetes Service (AKS) logs (stdout, stderr) optimization

Application logs are essential for troubleshooting and monitoring issues. All modern applications support some form of logging. Container engines are also designed to support logging: containerized applications write to the standard output (stdout) and standard error (stderr) streams.

Azure Monitor for containers provides a way to collect these logging streams into a Log Analytics workspace. Although this is a convenient, all-in-one solution for log capture, there may be no business case for storing all of these streams (especially stdout) in Log Analytics, as doing so can become extremely costly.

If your goal is to reduce logging costs, then stdout is the main culprit that can raise your monthly Azure bill significantly.

To check your log usage, you can run the following query against your Azure logs to get an idea.

Logs Query: 


let startTime = ago(1h);
let containerLogs = ContainerLog
| where TimeGenerated > startTime
| where _IsBillable == true
| summarize BillableDataMBytes = sum(_BilledSize)/ (1000. * 1000.) by LogEntrySource, ContainerID;
let kpi = KubePodInventory
| where TimeGenerated > startTime
| distinct ContainerID, Namespace;
containerLogs
| join kpi on $left.ContainerID == $right.ContainerID
| extend sourceNamespace = strcat(LogEntrySource, "/", Namespace)
| summarize MB=sum(BillableDataMBytes) by sourceNamespace
| render piechart
 


To reduce this log data, we can turn off stdout collection and, if collecting environment variable data is not required, env_var collection.

Note: even if we disable stdout log collection, that doesn't mean we can't debug applications by looking at live logs. The logs are still being generated, and they can be viewed either by running "kubectl logs" for the pod, or in the Azure portal under Live Logs in the workloads/pods section.

By default, all of these logs are enabled once the oms_agent is enabled with Log Analytics. Below is an example of how to configure this in Terraform on the AKS cluster resource.

Terraform Code:

# Inside the azurerm_kubernetes_cluster resource:
addon_profile {
  oms_agent {
    enabled                    = true
    log_analytics_workspace_id = var.log_analytics_resource_id
  }
}


To fine-tune and disable the logs that are not required, the OMS agent looks for a specific ConfigMap named "container-azm-ms-agentconfig". Below is all you need to create that ConfigMap; this example disables stdout and environment variable collection for Log Analytics.

Terraform Code:

resource "kubernetes_config_map" "aks_config_map_log_collection" {
  metadata {
    name      = "container-azm-ms-agentconfig"
    namespace = "kube-system"
  }

  data = {
    schema-version               = "v1"
    config-version               = "1.0.0"
    log-data-collection-settings = <<-EOF
      [log_collection_settings]
          [log_collection_settings.stdout]
          enabled = false
          [log_collection_settings.env_var]
          enabled = false
      EOF
  }
}

You can further restrict collection by excluding specific namespaces from these logs. The details are at the following link.

https://docs.microsoft.com/en-us/azure/azure-monitor/containers/container-insights-cost#controlling-ingestion-to-reduce-cost
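For example, instead of disabling stdout entirely, the same ConfigMap can keep stdout collection on while excluding noisy namespaces; per the linked documentation, the stdout section accepts an exclude_namespaces list (the namespace names here are illustrative):

```hcl
# Sketch: keep stdout collection enabled but skip selected namespaces.
resource "kubernetes_config_map" "aks_config_map_log_collection" {
  metadata {
    name      = "container-azm-ms-agentconfig"
    namespace = "kube-system"
  }

  data = {
    schema-version               = "v1"
    config-version               = "1.0.0"
    log-data-collection-settings = <<-EOF
      [log_collection_settings]
          [log_collection_settings.stdout]
          enabled = true
          exclude_namespaces = ["kube-system", "monitoring"]
          [log_collection_settings.env_var]
          enabled = false
      EOF
  }
}
```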

