FinOps - Budget and Alerts

Posted by Nikos Tsirmirakis on 2023-10-30

In my previous post, I demonstrated how to add budget guardrails to the pull request process and check estimated costs of the environment based on Terraform code (IaC) using the Infracost tool.

In this post, we will add a budget and alerts as part of our Terraform code.

With Terraform we can configure a budget at any of the following scopes; in our scenario we will configure it on the Resource Group.

  • Management Group
  • Subscription
  • Resource Group
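
For comparison, here is a minimal sketch of the same idea at Subscription scope, using the azurerm provider's `azurerm_consumption_budget_subscription` resource (the Management Group equivalent is `azurerm_consumption_budget_management_group`). The resource label, budget name, amount and email address are placeholders for illustration and are not taken from the sample repository.

```hcl
# Look up the subscription the provider is authenticated against.
data "azurerm_subscription" "current" {}

# Hypothetical subscription-level budget; names and values are placeholders.
resource "azurerm_consumption_budget_subscription" "example" {
  name            = "sub-monthly-budget"
  subscription_id = data.azurerm_subscription.current.id

  amount     = 510
  time_grain = "Monthly"

  time_period {
    start_date = "2023-10-01T00:00:00Z"
  }

  # A single forecast alert at 90% of the budget.
  notification {
    enabled        = true
    threshold      = 90.0
    threshold_type = "Forecasted"
    operator       = "GreaterThanOrEqualTo"
    contact_emails = ["finops@example.com"]
  }
}
```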

Alerts can be configured for current or forecast costs, and we will configure them for both to take advantage of Microsoft's forecasting algorithm.

Sample Terraform code is available in the DBAinTheCloud GitHub repository.

Budget

The most challenging part is working out a realistic budget amount, as this is the basis for alerting. On the one hand, we do not want to receive constant alerts or use the entire budget in the first few days; on the other hand, we do not want to lock up too much budget for a resource group. For an initial estimate we can use the Azure pricing calculator, or Infracost to automatically generate a cost estimate based on the Terraform code.

In our scenario, the estimated cost of our environment, an AKS cluster with 2 nodes, is £422.00, so we will add 20% for contingency (e.g. throughput costs, which are very difficult to estimate upfront) and round it up to £510.00 (422.00 * 1.2 = 506.40). This calculation assumes that our cluster is running 24/7; however, in a development environment the amount can be significantly reduced if the cluster is used only during business hours (9-17). If we allow additional hours for building and destroying the infrastructure, our cluster should be running for up to 50 hours a week, which is around 30% of the 168 hours in a week (29.76%), so with 20% contingency and rounding up we can set a budget of £160.00 (422.00 * 0.3 * 1.2 = 151.92).
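
The arithmetic above could also be kept alongside the Terraform code so that the assumptions stay visible and easy to revise. The following is only a sketch: all local names are hypothetical, and rounding up to the nearest £10 is an assumption that happens to reproduce the figures in this post.

```hcl
locals {
  estimated_cost      = 422.00 # Infracost estimate for the cluster running 24/7
  contingency         = 1.2    # +20% for costs that are hard to estimate upfront
  hours_used_per_week = 50     # business hours plus build/destroy time
  hours_in_week       = 24 * 7 # 168

  # Share of the week the cluster is actually running (about 0.2976).
  usage_ratio = local.hours_used_per_week / local.hours_in_week

  # 422.00 * 1.2 = 506.40, rounded up to the nearest 10 -> 510
  budget_always_on = ceil(local.estimated_cost * local.contingency / 10) * 10

  # 422.00 * 0.2976 * 1.2 is roughly 150.71, rounded up to the nearest 10 -> 160
  budget_business_hours = ceil(local.estimated_cost * local.usage_ratio * local.contingency / 10) * 10
}

output "budget_always_on" {
  value = local.budget_always_on
}

output "budget_business_hours" {
  value = local.budget_business_hours
}
```

Either value could then be passed in as `var.budget_amount`, depending on how the environment is used.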

Budgeting and forecasting is an art in its own right, and we have not covered every scenario or considered every aspect; as a rule of thumb, however, this is a good starting point. We should revisit the budget and its assumptions after the first 3 months and then review them every quarter.

Alerting

In our scenario, we have configured four alert levels for both actual and forecast spending, with thresholds of 80%, 90%, 100% and 120%. The right thresholds depend very much on your requirements and on the action plan for when an alert fires. Given the way we constructed our initial budget (the £422.00 estimate is about 83% of the £510.00 budget), we should expect to trigger the first level (80%) every month, so to reduce noise we can either remove it or raise it to 85%.
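
One way to make such a threshold change in a single place is to build the notification map from a list of thresholds. This is a sketch of an alternative layout rather than the code from the sample repository: `budget_thresholds` and `budget_notifications` are hypothetical names, and the list already swaps 80% for 85% as discussed above.

```hcl
locals {
  # Alert levels as a percentage of the budget; edit this list to tune noise.
  budget_thresholds = [85, 90, 100, 120]

  # One map entry per threshold, keyed by the threshold as a string.
  budget_notifications = {
    for t in local.budget_thresholds :
    tostring(t) => {
      operator       = "GreaterThanOrEqualTo"
      contact_emails = [local.notification_email]
    }
  }
}
```

The same map could then be used as the `for_each` source for both the actual and the forecast budget, since the two differ only in `threshold_type`.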

Actual

# Actual

locals {
  budget_actual_notifications = {
    80.0  = { operator = "GreaterThanOrEqualTo", contact_emails = [local.notification_email] }
    90.0  = { operator = "GreaterThanOrEqualTo", contact_emails = [local.notification_email] }
    100.0 = { operator = "GreaterThanOrEqualTo", contact_emails = [local.notification_email] }
    120.0 = { operator = "GreaterThanOrEqualTo", contact_emails = [local.notification_email] }
  }
}

resource "azurerm_consumption_budget_resource_group" "cbrga" {
  name              = "${azurerm_resource_group.rg1.name}-m-actual-${var.budget_amount}"
  resource_group_id = azurerm_resource_group.rg1.id

  amount     = var.budget_amount
  time_grain = "Monthly"

  time_period {
    start_date = "2023-10-01T00:00:00Z"
  }

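  # NOTE: this filter restricts the budget to costs from the listed
  # resource IDs (here, only the action group). Omit the filter block
  # to track spending for the whole resource group.
  filter {
    dimension {
      name = "ResourceId"
      values = [
        azurerm_monitor_action_group.mag1.id,
      ]
    }
  }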

  dynamic "notification" {
    for_each = local.budget_actual_notifications
    content {
      enabled        = true
      threshold      = notification.key
      threshold_type = "Actual"
      operator       = notification.value.operator
      contact_emails = notification.value.contact_emails
      contact_groups = [azurerm_monitor_action_group.mag1.id]
      contact_roles  = ["Owner"]
    }
  }
}

Forecast

# Forecast

locals {

  budget_forecast_notifications = {
    80.0  = { operator = "GreaterThanOrEqualTo", contact_emails = [local.notification_email] }
    90.0  = { operator = "GreaterThanOrEqualTo", contact_emails = [local.notification_email] }
    100.0 = { operator = "GreaterThanOrEqualTo", contact_emails = [local.notification_email] }
    120.0 = { operator = "GreaterThanOrEqualTo", contact_emails = [local.notification_email] }
  }
}

resource "azurerm_consumption_budget_resource_group" "cbrgf" {
  name              = "${azurerm_resource_group.rg1.name}-m-forecast-${var.budget_amount}"
  resource_group_id = azurerm_resource_group.rg1.id

  amount     = var.budget_amount
  time_grain = "Monthly"

  time_period {
    start_date = "2023-10-01T00:00:00Z"
  }

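  # NOTE: this filter restricts the budget to costs from the listed
  # resource IDs (here, only the action group). Omit the filter block
  # to track spending for the whole resource group.
  filter {
    dimension {
      name = "ResourceId"
      values = [
        azurerm_monitor_action_group.mag1.id,
      ]
    }
  }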

  dynamic "notification" {
    for_each = local.budget_forecast_notifications
    content {
      enabled        = true
      threshold      = notification.key
      threshold_type = "Forecasted"
      operator       = notification.value.operator
      contact_emails = notification.value.contact_emails
      contact_groups = [azurerm_monitor_action_group.mag1.id]
      contact_roles  = ["Owner"]
    }
  }
}
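
Both budget resources reference a variable, a local value, a resource group and an action group that are not shown in this post. A minimal sketch of what those supporting declarations might look like follows; the names, location, default amount and email address are placeholders and may differ from the sample repository.

```hcl
variable "budget_amount" {
  description = "Monthly budget for the resource group, in the billing currency."
  type        = number
  default     = 510
}

locals {
  # Placeholder address; replace with a real distribution list.
  notification_email = "finops@example.com"
}

resource "azurerm_resource_group" "rg1" {
  name     = "rg-finops-demo" # hypothetical name
  location = "uksouth"        # hypothetical region
}

resource "azurerm_monitor_action_group" "mag1" {
  name                = "ag-budget-alerts" # hypothetical name
  resource_group_name = azurerm_resource_group.rg1.name
  short_name          = "budget" # Azure limits this to 12 characters

  email_receiver {
    name          = "finops"
    email_address = local.notification_email
  }
}
```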

Conclusion

Azure Budgets are a handy, easy-to-implement feature that is native to the Azure ecosystem and should be part of any FinOps strategy. With a few lines of code, they can save you from a bitter surprise at the end of the month when costs have skyrocketed.

If you need help with implementing FinOps guardrails, or you would like to save on your cloud spending, get in touch.