Overview

The canary stage is a subset of the fleet within the production and staging environments that can be deployed independently of the main environment.

  • When querying in Prometheus, metrics for canary are labeled as stage=cny
  • Non-canary boxes are labeled as stage=main
  • The dashboard for canary shows the current version as well as some canary-specific metrics
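The stage label above can be used to split any metric between canary and the main stage. The following is a minimal sketch that only builds the PromQL selector strings; the helper name `stage_selector` and the metric name `http_requests_total` are placeholders for illustration, not names taken from the actual dashboards.

```python
def stage_selector(metric: str, stage: str) -> str:
    """Return a PromQL selector for `metric` filtered by the stage label."""
    # Only the two stage values named in this document are accepted.
    if stage not in ("cny", "main"):
        raise ValueError(f"unknown stage: {stage!r}")
    return f'{metric}{{stage="{stage}"}}'


# "http_requests_total" is a placeholder metric name (assumption).
print(stage_selector("http_requests_total", "cny"))
print(stage_selector("http_requests_total", "main"))
```

Pasting the printed selector for stage=cny next to the one for stage=main in Prometheus is one way to compare the two stages side by side.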

The following backends support canary traffic:

  • API
  • HTTPS GIT
  • Registry
  • Web

By default, all web requests with the cookie gitlab_canary=true are directed to canary. In addition, certain request paths are sent to canary; see the handbook page on canary for more information about which traffic is sent there and how to opt in and opt out.
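To illustrate the cookie-based routing, here is a minimal sketch that builds an HTTP request carrying the gitlab_canary=true cookie using only the Python standard library. The host `gitlab.example.com` and the helper name `canary_request` are assumptions for illustration; the request is constructed but not sent.

```python
import urllib.request

# Cookie name and value as described in this document.
CANARY_COOKIE = "gitlab_canary=true"


def canary_request(url: str) -> urllib.request.Request:
    """Build a request that opts in to canary via the Cookie header."""
    return urllib.request.Request(url, headers={"Cookie": CANARY_COOKIE})


# Placeholder host (assumption); nothing is sent over the network here.
req = canary_request("https://gitlab.example.com/")
print(req.get_header("Cookie"))
```

Passing the resulting object to `urllib.request.urlopen` would send it; whether the response was actually served by canary has to be checked separately.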

Request paths for canary are configured in Chef for production: https://gitlab.com/gitlab-com/gl-infra/chef-repo/-/blob/master/roles/gprd-base-lb-fe-config.json#L88

HOW TO STOP ALL PRODUCTION TRAFFIC TO CANARY

Run the following in the #production Slack channel to disable the canary stage in production by draining it and then setting it into maintenance:

/chatops run canary --disable --production

Attention: Make sure no canary deployment is in progress at the same time! (Safety controls for this still need to be added.)

Note: Disabling canary will drain the canary servers, wait 60 seconds, and then set them into maintenance.

Once the servers are in maintenance, the deployer pre-checks are able to pass and the canary nodes are not re-enabled during subsequent deployments.
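The order of operations behind --disable (drain, wait 60 seconds, maintenance) can be sketched as follows. This is not the ChatOps implementation; `disable_canary` and its `set_state` callback are hypothetical names that only model the sequence described in the note above.

```python
import time
from typing import Callable


def disable_canary(
    set_state: Callable[[str], None],
    sleep: Callable[[float], None] = time.sleep,
    drain_wait_seconds: float = 60,
) -> None:
    """Model the disable sequence: drain, wait, then maintenance."""
    set_state("drain")         # stop sending new connections to canary
    sleep(drain_wait_seconds)  # give existing connections time to finish
    set_state("maint")         # hold the servers in maintenance


# Record the steps instead of touching real servers or actually sleeping.
steps = []
disable_canary(steps.append, sleep=lambda s: steps.append(f"wait {s}s"))
print(steps)
```

Injecting `set_state` and `sleep` keeps the sketch runnable without waiting a full minute or needing access to any load balancer.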

How to re-enable canary

Run the following command in the #production Slack channel to re-enable the canary fleet in production:

/chatops run canary --enable --production

Canary ChatOps

The canary ChatOps command is the primary way to control canary traffic. It has the following options:

Controls canary traffic

Usage: canary [OPTIONS]

Options:

  -h, --help    Shows this help message
  --production  Control production canary traffic instead of staging
  --ready       Set canary to enable connections
  --enable      Set canary to enable connections (same as --ready)
  --drain       Set canary to drain connections
  --maint       Set canary to be in maint state
  --disable     Set canary to be disabled, drains and then sets maint state

  • When setting --{ready,drain,maint,disable,enable}, the status is displayed afterwards so you can see the result of the change
  • Specifying a backend can limit canary to a subset of traffic if desired.
  • Don’t forget to use --production to target production instead of staging, if needed

The default canary option is to display the current connection status:

/chatops run canary --production

canary_api          : conn:2 UP:24
canary_ci_api       : conn:0 UP:6
canary_ci_https_git : conn:10 UP:6
canary_https_git    : conn:1 UP:24
canary_registry     : conn:0 UP:2
canary_web          : conn:22 UP:72
UP: web-cny-03-sv-gprd, web-cny-06-sv-gprd, web-cny-01-sv-gprd, web-cny-04-sv-gprd, web-cny-02-sv-gprd, web-cny-05-sv-gprd, git-cny-01-sv-gprd, git-cny-02-sv-gprd, api-cny-01-sv-gprd, api-cny-02-sv-gprd, gke-cny-registry

The above status shows the number of connections for each backend and the number of servers reporting UP. Note that there are many HAProxy servers, so the number of servers reported for each backend is multiplied by the number of load balancers.
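As a minimal sketch of how this status output could be post-processed, the snippet below parses the per-backend lines into connection and UP counts and divides an UP count by a load balancer count. The function names are hypothetical, and the load balancer count used in the example call is a made-up value for illustration, not a figure from this document.

```python
import re

# Matches lines such as "canary_api          : conn:2 UP:24".
STATUS_LINE = re.compile(
    r"^(?P<backend>\w+)\s*:\s*conn:(?P<conn>\d+)\s+UP:(?P<up>\d+)$"
)


def parse_status(text: str) -> dict:
    """Map each backend name to a (connections, servers_up) tuple."""
    result = {}
    for line in text.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:  # the trailing "UP: host, host, ..." line is skipped
            result[match["backend"]] = (int(match["conn"]), int(match["up"]))
    return result


def servers_per_lb(up_count: int, lb_count: int) -> float:
    """Undo the multiplication of the UP count by the number of load balancers."""
    return up_count / lb_count


sample = """\
canary_api          : conn:2 UP:24
canary_web          : conn:22 UP:72
UP: web-cny-03-sv-gprd, api-cny-01-sv-gprd
"""
status = parse_status(sample)
print(status)
# lb_count=12 is a hypothetical value, chosen only to show the arithmetic.
print(servers_per_lb(status["canary_web"][1], lb_count=12))
```

Because backends may be served by different sets of load balancers, the divisor would need to be chosen per backend.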