coder/.github/workflows/flake-go.yaml
Ethan 5088b5fa5f ci: extend flake-go bump test-count to 35 (#25981)
Two changes to make the `flake-go` workflow produce better signal when
something hangs or flakes at low rates.

**Job timeout (20m → 25m).** The Go-level `-timeout 20m` baked into
`make test` (`Makefile:1428`) raced the runner's 20m hard-kill, so a
hanging test got SIGTERM'd by Actions instead of SIGQUIT'd by Go, and
we never got the goroutine dump. Raising the workflow job timeout to
25m mirrors the layering already used by `test-go-pg` in `ci.yaml:409`
and gives Go's timeout the 5m head start it expects.
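
The layering can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the repository: `parse_go_duration` and `headroom_seconds` are hypothetical helpers, and the parser handles only whole `h`/`m`/`s` units rather than the full Go duration syntax.

```python
import re

# Seconds per unit for the simplified duration parser below.
_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_go_duration(text: str) -> int:
    """Parse a simple Go-style duration such as "20m" or "1h30m" into seconds.

    Hypothetical helper: only whole h/m/s components are supported.
    """
    parts = re.findall(r"(\d+)([hms])", text)
    # Reject anything the regex did not fully consume (e.g. "500ms", "1.5h").
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    return sum(int(n) * _UNIT_SECONDS[u] for n, u in parts)


def headroom_seconds(job_timeout_minutes: int, go_timeout: str) -> int:
    """Seconds between Go's test timeout firing and the runner killing the job."""
    return job_timeout_minutes * 60 - parse_go_duration(go_timeout)


print(headroom_seconds(20, "20m"))  # old layout: both timeouts fire together
print(headroom_seconds(25, "20m"))  # new layout: Go's timeout fires first
```

A positive result means Go's own timeout fires before the runner's, which is the condition the goroutine dump depends on.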

**Test count (25 → 35).** Catches lower-frequency flakes that 25
attempts miss too often. For a 5% per-run flake, detection probability
goes from ~72% at n=25 to ~83% at n=35; for 1–2% flakes the relative
lift is larger. The longest successful flake-go run to date was 11m49s
at n=25, so n=35 should peak at ~16–17m and stay well inside the new
25m budget.
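
The figures above can be reproduced with a short sketch. It assumes each attempt fails independently with the same probability and that wall-clock time scales linearly with the test count; both are simplifications, and the function names are hypothetical.

```python
def detection_probability(flake_rate: float, runs: int) -> float:
    """Chance that at least one of `runs` independent attempts fails."""
    return 1.0 - (1.0 - flake_rate) ** runs


def projected_runtime_seconds(observed_seconds: float, old_count: int, new_count: int) -> float:
    """Linear projection of wall-clock time when the test count changes."""
    return observed_seconds * new_count / old_count


for rate in (0.01, 0.02, 0.05):
    before = detection_probability(rate, 25)
    after = detection_probability(rate, 35)
    print(f"{rate:.0%} flake: n=25 -> {before:.1%}, n=35 -> {after:.1%}")

# Longest successful run quoted above: 11m49s at n=25.
longest = 11 * 60 + 49
projected = projected_runtime_seconds(longest, 25, 35)
print(f"projected peak at n=35: {projected / 60:.1f} minutes")
```

Under these assumptions a 5% flake is caught in roughly 72% of runs at n=25 and 83% at n=35, and the projected peak lands between 16 and 17 minutes.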
2026-06-03 00:20:42 +10:00


name: flake-go

on:
  pull_request:
  workflow_dispatch:
    inputs:
      base_sha:
        description: "Base commit to diff against. Defaults to merge-base against origin/main."
        required: false
        type: string
      head_sha:
        description: "Head commit to analyze. Defaults to the checked out HEAD."
        required: false
        type: string

permissions:
  contents: read

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  flake_go:
    name: Flake Check
    runs-on: ${{ github.repository_owner == 'coder' && 'depot-ubuntu-22.04-4' || 'ubuntu-latest' }}
    # This timeout must be greater than the Go test timeout set in `make test`
    # (-timeout 20m) so we receive a goroutine trace before the runner kills
    # the job. Mirrors the test-go-pg job in ci.yaml.
    timeout-minutes: 25
    steps:
      - name: Harden Runner
        uses: step-security/harden-runner@f808768d1510423e83855289c910610ca9b43176 # v2.17.0
        with:
          egress-policy: audit

      - name: Checkout
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
        with:
          repository: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.repo.full_name || github.repository }}
          ref: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.event.inputs.head_sha || github.sha }}
          fetch-depth: 0
          persist-credentials: false

      - name: Set up Go
        uses: ./.github/actions/setup-mise
        with:
          install-args: "go"

      - name: Restore Go cache
        uses: ./.github/actions/go-cache

      - name: Install Go mise tools
        run: ./.github/scripts/retry.sh -- mise install --locked go:github.com/coder/whichtests go:gotest.tools/gotestsum

      - name: Select changed tests
        id: selector
        shell: bash
        run: |
          set -euo pipefail
          whichtests \
            --repo-root . \
            --github-actions \
            --coalesce \
            --out-matrix "$RUNNER_TEMP/flake-matrix.json"

      - name: Set up Terraform
        if: ${{ fromJSON(steps.selector.outputs.matrix).include[0] != null }}
        uses: ./.github/actions/setup-mise
        with:
          install-args: "terraform"

      - name: Run targeted Go flake checks
        id: flake_check
        if: ${{ fromJSON(steps.selector.outputs.matrix).include[0] != null }}
        uses: ./.github/actions/test-go-pg
        with:
          postgres-version: "13"
          test-parallelism-packages: "4"
          test-parallelism-tests: "16"
          test-count: "35"
          test-packages: ${{ fromJSON(steps.selector.outputs.matrix).include[0].package }}
          run-regex: ${{ fromJSON(steps.selector.outputs.matrix).include[0].run_regex }}
          test-shuffle: "on"
          gotestsum-json-file: default

      - name: Publish Go test failure report
        if: failure() && steps.flake_check.outcome == 'failure' && github.actor != 'dependabot[bot]' && runner.os == 'Linux' && (github.event_name != 'pull_request' || !github.event.pull_request.head.repo.fork)
        uses: ./.github/actions/go-test-failure-report
        with:
          artifact-name: go-test-failures-${{ github.job }}-${{ github.sha }}