Silence Access Control Lists

Intro

Karma provides the ability to set up ACLs for silences created by users. ACLs limit what kind of silences each user is allowed to create. This helps prevent, for example, Team A from accidentally silencing alerts for Team B. It can also block the engineering team from creating any silences at all, leaving that ability only to the sysadmin / SRE team.

Example Alertmanager silence:

{
  "matchers": [
    {
      "name": "alertname",
      "value": "Test Alert",
      "isRegex": false
    },
    {
      "name": "cluster",
      "value": "prod",
      "isRegex": false
    },
    {
      "name": "instance",
      "value": "server1",
      "isRegex": false
    }
  ],
  "startsAt": "2020-03-09T20:11:00.000Z",
  "endsAt": "2020-03-09T21:11:00.000Z",
  "createdBy": "[email protected]",
  "comment": "Silence Test Alert on server1"
}

It would be applied to all alerts with name Test Alert where label cluster is equal to prod and label instance is equal to server1. An ACL rule could be used to restrict silence creation based on matched labels, so that, for example, only selected users would be allowed to silence this specific alert.
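The matching behaviour described above can be sketched in a few lines of Python. This is only an illustrative model, not Karma or Alertmanager code: the function names are hypothetical, and it assumes that a silence applies only when every matcher matches, that regex matchers are fully anchored, and that a missing label is treated as an empty value.

```python
import re

def matcher_matches(matcher, labels):
    """Return True if a single silence matcher matches the alert labels."""
    # Assumption: a label missing from the alert behaves like an empty value.
    value = labels.get(matcher["name"], "")
    if matcher["isRegex"]:
        # Assumption: regex matchers are anchored, so the whole value must match.
        return re.fullmatch(matcher["value"], value) is not None
    return value == matcher["value"]

def silence_matches(silence, labels):
    """A silence applies only when every one of its matchers matches."""
    return all(matcher_matches(m, labels) for m in silence["matchers"])

silence = {
    "matchers": [
        {"name": "alertname", "value": "Test Alert", "isRegex": False},
        {"name": "cluster", "value": "prod", "isRegex": False},
        {"name": "instance", "value": "server1", "isRegex": False},
    ]
}

alert_on_server1 = {"alertname": "Test Alert", "cluster": "prod", "instance": "server1"}
alert_on_server2 = {"alertname": "Test Alert", "cluster": "prod", "instance": "server2"}

print(silence_matches(silence, alert_on_server1))  # True
print(silence_matches(silence, alert_on_server2))  # False
```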

Requirements

For ACLs to work, a few configuration options are required:

Optional configuration:

Regex silences

Alertmanager silences can use regex matchers, which makes it tricky to apply ACLs to those silences.

Silence example using regex:

{
  "matchers": [
    {
      "name": "alertname",
      "value": "Test Alert",
      "isRegex": false
    },
    {
      "name": "cluster",
      "value": "staging|prod",
      "isRegex": true
    }
  ],
  "startsAt": "2020-03-09T20:11:00.000Z",
  "endsAt": "2020-03-09T21:11:00.000Z",
  "createdBy": "[email protected]",
  "comment": "Silence Test Alert in staging & prod cluster"
}

The difference compared to the previous example is that the cluster label is now matched using the staging|prod regex, so any alert with the cluster label equal to staging or prod will be matched. This is a simple example; regexes can express far more complex matching rules.

The effect on ACL rules can be illustrated with this example: let’s say we have a group that should never be allowed to create any silence for the prod cluster, so a silence like the one below should be blocked:

{
  "matchers": [
    {
      "name": "alertname",
      "value": "Test Alert",
      "isRegex": false
    },
    {
      "name": "cluster",
      "value": "prod",
      "isRegex": false
    }
  ],
  "startsAt": "2020-03-09T20:11:00.000Z",
  "endsAt": "2020-03-09T21:11:00.000Z",
  "createdBy": "[email protected]",
  "comment": "Silence Test Alert in prod cluster"
}

But if we created an ACL rule that simply blocks silences with the matcher:

{
  "name": "cluster",
  "value": "prod",
  "isRegex": false
}

then any user could bypass that with a regex matcher like:

{
  "name": "cluster",
  "value": "pro[d]",
  "isRegex": true
}

Because of that, it is highly recommended to block regex silences, which can be done with an ACL rule. Since rules are evaluated in the order they are listed in the config file, it is best to make this the very first rule. See the examples below to learn how to block regex silences.
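The bypass described above can be reproduced with a short Python sketch. Both functions are hypothetical and exist only for illustration: naive_block_prod models an ACL check that compares matchers literally, and matcher_selects models how an anchored regex matcher selects label values.

```python
import re

def naive_block_prod(matcher):
    """Hypothetical naive ACL check: block only the literal cluster=prod matcher."""
    return (
        matcher["name"] == "cluster"
        and matcher["value"] == "prod"
        and not matcher["isRegex"]
    )

def matcher_selects(matcher, label_value):
    """Return True if the matcher would select an alert with this label value."""
    if matcher["isRegex"]:
        # Assumption: regex matchers are anchored (whole value must match).
        return re.fullmatch(matcher["value"], label_value) is not None
    return matcher["value"] == label_value

plain = {"name": "cluster", "value": "prod", "isRegex": False}
bypass = {"name": "cluster", "value": "pro[d]", "isRegex": True}

print(naive_block_prod(plain))          # True: the literal matcher is caught
print(naive_block_prod(bypass))         # False: the regex matcher slips through
print(matcher_selects(bypass, "prod"))  # True: yet it still silences prod alerts
```

Because infinitely many regexes select the value prod, enumerating them is not practical, which is why blocking regex matchers outright is the safer choice.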

Configuration syntax

Rule syntax:

action: string
reason: string
scope:
  groups: list of strings
  alertmanagers: list of strings
  filters: list of filters
matchers:
  required: list of silence matchers

Examples

Block silences using regex matchers

This rule will match all silences that have any matcher using a regex (isRegex: true on the matcher) and block them.

rules:
  - action: block
    reason: all regex silences are blocked, use only concrete label names and values
    scope:
      filters:
        - name_re: .+
          value_re: .+
          isRegex: true

Allow group to create any silence

rules:
  - action: allow
    reason: admins are allowed
    scope:
      groups:
        - admins

Allow only admins group to create silences with cluster=prod

First allow all members of the admins group to create any silence, then block silences with cluster=prod. ACL rules are evaluated in the order specified, and the first matching allow or block rule stops further rule processing. As a result, admins can create cluster=prod silences while everyone else is blocked from doing so. Blocking regex silences in the very first rule prevents users from bypassing those ACLs with regex matchers.

rules:
  - action: block
    reason: all regex silences are blocked, use only concrete label names and values
    scope:
      filters:
        - name_re: .+
          value_re: .+
          isRegex: true
  - action: allow
    reason: admins are allowed
    scope:
      groups:
        - admins
  - action: block
    reason: only admins can create silences with cluster=prod
    scope:
      filters:
        - name: cluster
          value: prod
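The first-match evaluation order used by the rules above can be modelled with a minimal Python sketch. It is a simplified illustration, not Karma's implementation: the function names are hypothetical, only allow and block actions are handled, every listed filter is assumed to need at least one matching silence matcher, regexes are assumed to be anchored, and a silence that matches no rule is assumed to be allowed.

```python
import re

def filter_matches(flt, matcher):
    """Check one scope filter against one silence matcher (simplified)."""
    if "name" in flt and flt["name"] != matcher["name"]:
        return False
    if "name_re" in flt and not re.fullmatch(flt["name_re"], matcher["name"]):
        return False
    if "value" in flt and flt["value"] != matcher["value"]:
        return False
    if "value_re" in flt and not re.fullmatch(flt["value_re"], matcher["value"]):
        return False
    if "isRegex" in flt and flt["isRegex"] != matcher["isRegex"]:
        return False
    return True

def rule_in_scope(rule, user_groups, silence):
    """A rule applies when its group and filter scopes both match."""
    scope = rule.get("scope", {})
    groups = scope.get("groups")
    if groups and not set(groups) & set(user_groups):
        return False
    # Assumption: each filter must match at least one matcher of the silence.
    for flt in scope.get("filters", []):
        if not any(filter_matches(flt, m) for m in silence["matchers"]):
            return False
    return True

def evaluate(rules, user_groups, silence):
    """Return the action of the first rule in scope; later rules are ignored."""
    for rule in rules:
        if rule_in_scope(rule, user_groups, silence):
            return rule["action"]
    return "allow"  # Assumption: no matching rule means the silence is allowed.

rules = [
    {"action": "block",
     "scope": {"filters": [{"name_re": ".+", "value_re": ".+", "isRegex": True}]}},
    {"action": "allow", "scope": {"groups": ["admins"]}},
    {"action": "block",
     "scope": {"filters": [{"name": "cluster", "value": "prod"}]}},
]

prod = {"matchers": [{"name": "cluster", "value": "prod", "isRegex": False}]}
regex = {"matchers": [{"name": "cluster", "value": "pro[d]", "isRegex": True}]}

print(evaluate(rules, ["admins"], prod))   # allow
print(evaluate(rules, ["devs"], prod))     # block
print(evaluate(rules, ["admins"], regex))  # block
```

Note that in this model even admins are blocked from regex silences, because the regex rule is listed before the admins rule and the first match wins.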

Require postgresAdmins group to always specify db=postgres in silences

Block postgresAdmins members from creating silences unless they add db=postgres to the list of matchers.

rules:
  - action: requireMatcher
    reason: postgres admins must add db=postgres to all silences
    scope:
      groups:
        - postgresAdmins
    matchers:
      required:
        - name: db
          value: postgres

Require devTeam group to specify instance=server1-3

Block devTeam members from creating silences unless they target one of the servers they own.

rules:
  - action: requireMatcher
    reason: devTeam can only silence owned servers
    scope:
      groups:
        - devTeam
    matchers:
      required:
        - name: instance
          value_re: server[1-3]
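The effect of a required matcher with value_re can be sketched in Python as follows. The helper name is hypothetical, and the sketch assumes that value_re is matched against the whole matcher value (anchored); under that assumption server10 does not satisfy server[1-3].

```python
import re

def has_required_matcher(required, silence):
    """Return True if any silence matcher satisfies the required matcher."""
    for m in silence["matchers"]:
        if "name" in required and required["name"] != m["name"]:
            continue
        if "value" in required and required["value"] != m["value"]:
            continue
        # Assumption: value_re must match the entire matcher value.
        if "value_re" in required and not re.fullmatch(required["value_re"], m["value"]):
            continue
        return True
    return False

required = {"name": "instance", "value_re": "server[1-3]"}

owned = {"matchers": [{"name": "instance", "value": "server2", "isRegex": False}]}
not_owned = {"matchers": [{"name": "instance", "value": "server10", "isRegex": False}]}
no_instance = {"matchers": [{"name": "alertname", "value": "Test Alert", "isRegex": False}]}

print(has_required_matcher(required, owned))        # True
print(has_required_matcher(required, not_owned))    # False
print(has_required_matcher(required, no_instance))  # False
```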

Require everyone to always specify team matcher in silences

Block anyone from creating silences unless they add a team matcher with a non-empty value.

rules:
  - action: requireMatcher
    reason: team label is required for all silences
    matchers:
      required:
        - name: team
          value_re: .+