Resolving QA failures
Overview
This document describes the process to follow if QA tests are failing.
QA smoke tests run as part of the auto-deploy pipeline, so they run regularly and can be assumed to be stable.
Process Overview
Failing QA tests are always tracked using an issue in the release tracker, which gives a record of the failure. This replaces the earlier practice of tracking them as incidents, a change made as part of https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/752
Quality are here to help us investigate and resolve any failures, and they maintain a schedule to show who to ping. All test results are posted in the #qa-
Process steps
- Follow the steps in the [Handling Deploy Failures runbook](https://gitlab.com/gitlab-org/release/docs/-/blob/master/general/deploy/failures.md#handling-deployment-failures) to create an issue in the release issue tracker.
- Check the `qa-<env>` Slack channel; the failure may already be known and being worked on.
- To escalate, ping the engineer listed in the Quality on call schedule and ask for assistance on the issue. The #quality Slack channel can also be used.
- If the failure appears to be code or environment-related, declare an incident with the correct [availability severity](https://about.gitlab.com/handbook/engineering/quality/issue-triage/#availability) and [Delivery impact](https://about.gitlab.com/handbook/engineering/releases/#delivery-impact-labels) label. Link to the release issue created in step 1.
- Once tests are passing again, update the issue with a summary of the failure. Apply the `deploys-blocked-gprd::*` and `deploys-blocked-gstg::*` labels and close the issue.
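
The final step above (summarise, label, close) could also be scripted against the GitLab REST API. The following is only a minimal sketch, not part of the documented process. It builds the two requests without sending them. The project path, the issue IID, and the values substituted for the `*` in each scoped label are hypothetical parameters that the caller must supply; this document does not specify them.

```python
from urllib.parse import quote

API_ROOT = "https://gitlab.com/api/v4"


def close_out_requests(project_path, issue_iid, summary, gprd_value, gstg_value):
    """Build (method, url, payload) tuples for closing out a QA failure issue.

    Nothing is sent here; the caller decides how to execute the requests
    (for example with an HTTP client and a personal access token).
    """
    # Namespaced project paths must be URL-encoded when used as the project id.
    project = quote(project_path, safe="")
    issue_url = f"{API_ROOT}/projects/{project}/issues/{issue_iid}"

    # The values after "::" are assumptions passed in by the caller.
    labels = [
        f"deploys-blocked-gprd::{gprd_value}",
        f"deploys-blocked-gstg::{gstg_value}",
    ]

    return [
        # 1. Record the summary of the failure as a comment on the issue.
        ("POST", f"{issue_url}/notes", {"body": summary}),
        # 2. Apply both scoped labels and close the issue in a single update.
        ("PUT", issue_url, {"add_labels": ",".join(labels), "state_event": "close"}),
    ]


if __name__ == "__main__":
    # Hypothetical project path, issue IID, and label values, for illustration only.
    for method, url, payload in close_out_requests(
        "example-group/release-tracker", 123, "Summary of the QA failure", "1hr", "0hr"
    ):
        print(method, url, payload)
```

Returning the requests as plain data keeps the sketch easy to review before anything is changed on a real issue.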
Quality on call
Quality maintain a schedule of engineers on call to assist us. See their responsibilities.