Overview

Post-deployment patches let SREs apply changes outside the normal release cycle. They bypass the normal packaging and release process for expediency when there is community impact on GitLab.com or a security vulnerability that needs immediate attention.

Every post-deployment patch must have an associated S1 or S2 production incident.

Patches do not live past a release on GitLab.com without an exception granted by the delivery team.

Patches are preserved across the next omnibus upgrade only if the patch targeted the next version. Despite this capability, it is highly recommended that the patched change be incorporated into the next release so that it receives the same level of testing and goes through the normal release pipeline.

Patches can only be issued for rails and sidekiq. After a patch is applied, rails and sidekiq are sent a HUP signal. It is not easy to issue a patch for frontend code.

Patches are automatically applied to staging-canary and are then manually gated for the other environments: staging, production-canary, and production. The gated environments can be patched in any order, but the typical order is staging-canary, production-canary, staging, production. Try to keep staging and production in sync (either apply the patch to both or to neither).

The following components do not support post-deployment patches:

  • Front-end code or assets
  • Gitaly
  • Workhorse
  • Nginx
  • PostgreSQL
  • Redis
  • Registry

Submitting a patch

The video walks through creating a post-deployment patch using the hotpatch Chatops command.

Patches are initiated by the proposing backend team. Two projects are used for post-deployment patches, one of which is a mirror of the other:

  1. https://ops.gitlab.net/gitlab-com/engineering/patcher: This project is where patches are created for patching GitLab.com
  2. https://ops.gitlab.net/gitlab-com/gl-infra/patcher/: This is a push mirror of engineering/patcher where pipelines run for applying patches.

Backend Developer

  1. Ensure that an incident issue exists. If not, engage the on-call SRE in Slack using @sre-oncall to create one.
  2. Initiate a hot patch by issuing the following Chatops command, where <incident number> is the production incident number, for example 1234.
/chatops run hotpatch --incident <incident number>
  • The only required option is --incident. In the unlikely situation where GitLab.com is down, any number can be used here as a placeholder.
  • The --package option can be used to override the package version if GitLab.com is in a state where the version is not being reported correctly to Chef.
  3. Some links are provided in the output of the command to help prep the patch. Use the MR link provided to prep the patch in the engineering/patcher project.
  4. Obtain the desired SHA revision by clicking the patch directory for the relevant environment.
  5. Create a working branch in gitlab from the current SHA running on production (run /chatops run auto_deploy status to find out what this is, or see the version from the output of the Chatops hotpatch command as described in step 4).
    • For example, run git checkout -b patch/my-fix 0bf60009bec to derive your branch from revision 0bf60009bec.
  6. Make your code changes.
    • You can cherry-pick commits.
    • In that case, remove changes to specs before generating the patch.
  7. Run the command git --no-pager diff --color=never 0bf60009bec.. -- . ':!spec' ':!ee/spec' > path/to/patch.patch
    • Note: this is an example. If you have changed non-spec files in other directories, be sure to include those.
  8. Copy the patch file to the corresponding placeholder directory for the environment that is being patched, on the branch created earlier by the Chatops hotpatch command. For example, to patch release 12.6.201912031517-0bf60009bec.8dfcd02384a, copy the patch file(s) into https://ops.gitlab.net/gitlab-com/engineering/patcher/tree/master/patches/12.6.201912031517-0bf60009bec.8dfcd02384a
  9. If there are already patch files for the release, simply add yours to the existing directory. The patcher tool will make sure all patches are applied.
  10. Patches are applied in sorted order. To specify the order, prefix the patch file name with a sort key such as:
    • 001-patch-for-issue-1.patch
    • 002-patch-for-issue-2.patch
  11. When the merge request is ready, inform @sre-oncall in Slack and in the merge request so that they can review and merge the patch.
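
Two of the points above can be checked locally before asking for review: that the patch does not touch spec files, and how it will sort among the patches already in the release directory. The following is a minimal sketch, not part of the patcher tool. Both function names are hypothetical, and it assumes that "sorted order" means plain lexical order of file names, which the text above does not state explicitly.

```shell
# Sketch only: both helpers are hypothetical and not part of the patcher tool.

# Succeeds (exit 0) if the patch file modifies anything under spec/ or
# ee/spec/, the two trees excluded by the git diff command in the list above.
# Relies on the default a/ and b/ path prefixes that git diff emits.
patch_touches_specs() {
  grep -Eq '^(---|\+\+\+) [ab]/(ee/)?spec/' "$1"
}

# Prints the patch file names in a release directory in lexical order.
# Assumption: the patcher's "sorted order" is lexical; verify before relying on it.
list_patches_in_order() (
  cd "$1" || exit 1
  LC_ALL=C ls | grep '\.patch$'
)
```

For example, running patch_touches_specs path/to/patch.patch && echo "remove spec changes first" flags a patch that still contains spec changes, and list_patches_in_order on a release directory shows where a new 00N- prefix would place the file.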

OnCall SRE or Release Manager

  • An SRE or Release Manager will merge the MR.
  • To monitor the progress of the patch deployment, see the pipeline view of the patcher repository.
  • When the patch is deployed to staging-canary and verified, deploy to production-canary using the gprd-cny-prepare manual job. When it is deployed and verified in production-canary, deploy to staging and production using the gstg-prepare and gprd-prepare jobs respectively.

Determining whether a patch will be applied

Three factors, all checked before any patch is applied, determine whether a patch will be applied to a node:

  1. Is there a rails service running on the node being patched?
  2. Does the patch release directory have an entry for the release that is running on the node being patched?
  3. Is the patch a change? Before the patch is actually applied, it is run in dry-run mode to see whether it would change anything.

This information is displayed in the output of the Ansible run in the following task:

TASK [The following hosts will have the patch applied:] ************************
ok: [api-03-sv-gstg.c.gitlab-staging-1.internal] =>
  msg: FALSE | patch_dir_exists=True rails_exists=True patch_is_a_change=False
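
When scanning many hosts, it can help to turn that msg line into an explicit reason. The sketch below is a hypothetical helper, not part of the patcher tooling. It assumes the line always has the shape "<TRUE|FALSE> | key=Value key=Value ...", which is inferred from the single sample above and may not hold for every run.

```shell
# Hypothetical helper: explain a "msg:" value from the task output above.
# Assumed format (inferred from one sample): "<TRUE|FALSE> | key=Value ..."
explain_patch_decision() (
  msg=$1
  decision=${msg%% *}                 # first word: TRUE or FALSE
  echo "will_apply=$decision"
  for pair in ${msg#*"| "}; do        # split the key=Value pairs on spaces
    case $pair in
      *=False) echo "blocked_by=${pair%%=*}" ;;
    esac
  done
)
```

Passing the sample value 'FALSE | patch_dir_exists=True rails_exists=True patch_is_a_change=False' reports patch_is_a_change as the blocking factor, which matches the third check: the dry run found nothing to change on that host.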

Submitting a patch adding or replacing a frontend asset

  • This process is manual and should not be done unless absolutely necessary.

  • An SRE or Release Manager will need to drive patches for front-end changes that include any new assets, because this process deviates from the normal patch procedure and requires direct access to the asset bucket.

  • Before starting a front-end patch, leave a note on https://gitlab.com/gitlab-org/release/framework/issues/34, which is the issue that tracks automating this process.

Upload asset files to object storage

GitLab staging, production, and pre-production each use an asset bucket for keeping hashed asset files. This bucket is the origin used by the CDN, and haproxy proxies all requests to /assets to the bucket.

The first step in making an asset patch is to upload the new hashed asset to the buckets for all environments.

  • Change to the branch that has already been created for the patch by the developer prepping the fix, or create a new one based on the tag on production (visit https://gitlab.com/help to find out what this is).
  • Run the following to generate the asset and manifest files. This is adapted from frontend.gitlab-ci.yml:
export NODE_ENV="production"
export RAILS_ENV="production"
export SETUP_DB="false"
export SKIP_STORAGE_VALIDATION="true"
export WEBPACK_REPORT="true"
export NODE_OPTIONS="--max_old_space_size=3584"
yarn install --frozen-lockfile --production --cache-folder .yarn-cache
bundle exec rake gitlab:assets:compile
  • Once this is complete, copy the asset files to a temporary directory and sync that directory to the bucket for each environment:
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/assets/path/to"
cp public/assets/path/to/asset "$tmpdir/assets/path/to/asset"
gsutil -h 'Cache-Control:public,max-age=31536000' rsync -a public-read -r "${tmpdir}/" "gs://gitlab-pre-assets"
gsutil -h 'Cache-Control:public,max-age=31536000' rsync -a public-read -r "${tmpdir}/" "gs://gitlab-gstg-assets"
gsutil -h 'Cache-Control:public,max-age=31536000' rsync -a public-read -r "${tmpdir}/" "gs://gitlab-gprd-assets"

Generate patches for the manifest file(s)

  • In addition to the new asset, you will need to patch the manifest file that references it. Which of the following two files that is depends on the type of asset that was updated. If the asset is located in the webpack/ directory you will need to patch the webpack manifest, otherwise you will need to update .sprockets-manifest-<hash>.json.
public/assets/webpack/manifest.json
public/assets/.sprockets-manifest-<hash>.json
  • Take the version of each manifest that is currently on production, save it with a .old suffix, and generate a patch file against it, if necessary:
public/assets/webpack/manifest.json.old
public/assets/.sprockets-manifest-<hash>.json.old

diff -u public/assets/webpack/manifest.json.old public/assets/webpack/manifest.json  > manifest.json.patch
diff -u public/assets/.sprockets-manifest-<hash>.json.old  public/assets/.sprockets-manifest-<hash>.json > .sprockets-manifest-<hash>.json.patch
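
One detail worth knowing when scripting the diff commands above: diff exits with status 0 when the files are identical, 1 when they differ, and a value greater than 1 on trouble, so status 1 is the expected outcome here and will abort a script that runs with set -e. The following wrapper is a minimal sketch; make_manifest_patch is a hypothetical name and the argument layout is an assumption for illustration.

```shell
# Hypothetical wrapper around the diff commands above.
# Usage: make_manifest_patch OLD_FILE NEW_FILE PATCH_FILE
make_manifest_patch() (
  old=$1 new=$2 out=$3
  diff -u "$old" "$new" > "$out"
  status=$?
  case $status in
    0) # identical files: an empty patch is not worth keeping
       echo "no differences: $new matches $old" >&2
       rm -f "$out"
       exit 1 ;;
    1) # files differ: this is the normal case when generating a patch
       echo "wrote $out" ;;
    *) # diff itself failed, for example because a file is missing
       echo "diff failed with status $status" >&2
       rm -f "$out"
       exit "$status" ;;
  esac
)
```

For example, make_manifest_patch public/assets/webpack/manifest.json.old public/assets/webpack/manifest.json manifest.json.patch produces the same file as the first diff command, and fails loudly if the two manifests turn out to be identical.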

Once the asset has been uploaded and the patch file for the manifest has been generated, the patch can be deployed to staging. It is recommended that you do a manual staging deploy in the CI pipeline before merging, using the manual job on the branch pipeline.


Rolling back a patch

  • If a patch needs to be rolled back, rename the patch file so that it has a .rollback extension. For example, if the patch file is named patches/11.5.1-ee.0/profiles_helper.patch, rename the file on a branch to patches/11.5.1-ee.0/profiles_helper.patch.rollback with the same content.
  • After the rollback file is created, contact a member of the delivery team to apply the rollback through the patcher pipeline.
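
The rename described above can be sketched as a small helper. This is an illustration only: mark_patch_for_rollback is a hypothetical name, and the sketch uses plain mv to stay self-contained. In a checkout of the patcher project you would likely use git mv instead so that the rename is staged on your branch.

```shell
# Hypothetical helper: give a patch file the .rollback extension described
# above. The file content is left untouched; only the name changes.
mark_patch_for_rollback() (
  patch_file=$1
  case $patch_file in
    *.patch) ;;                                   # expected extension
    *) echo "not a .patch file: $patch_file" >&2; exit 1 ;;
  esac
  [ -f "$patch_file" ] || { echo "no such file: $patch_file" >&2; exit 1; }
  mv "$patch_file" "$patch_file.rollback"         # consider "git mv" in a repo
  echo "$patch_file.rollback"                     # print the new name
)
```

Running mark_patch_for_rollback patches/11.5.1-ee.0/profiles_helper.patch prints patches/11.5.1-ee.0/profiles_helper.patch.rollback, matching the example in the first bullet.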